31 Dec, 2019

1 commit

  • [ Upstream commit daebf24a8e8c6064cba3a330db9fe9376a137d2c ]

    Currently the Linux Kernel Memory Model gives an incorrect response
    for the following litmus test:

    C plain-WWC

    {}

    P0(int *x)
    {
        WRITE_ONCE(*x, 2);
    }

    P1(int *x, int *y)
    {
        int r1;
        int r2;
        int r3;

        r1 = READ_ONCE(*x);
        if (r1 == 2) {
            smp_rmb();
            r2 = *x;
        }
        smp_rmb();
        r3 = READ_ONCE(*x);
        WRITE_ONCE(*y, r3 - 1);
    }

    P2(int *x, int *y)
    {
        int r4;

        r4 = READ_ONCE(*y);
        if (r4 > 0)
            WRITE_ONCE(*x, 1);
    }

    exists (x=2 /\ 1:r2=2 /\ 2:r4=1)

    The memory model says that the plain read of *x in P1 races with the
    WRITE_ONCE(*x) in P2.

    The problem is that we have a write W and a read R related by neither
    fre nor rfe, but rather W ->coe W' ->rfe R, where W' is an intermediate
    write (the WRITE_ONCE() in P0). In this situation there is no
    particular ordering between W and R, so either a wr-vis link from W to
    R or an rw-xbstar link from R to W would prove that the accesses
    aren't concurrent.

    But the LKMM only looks for a wr-vis link, which is equivalent to
    assuming that W must execute before R. This is not necessarily true
    on non-multicopy-atomic systems, as the WWC pattern demonstrates.

    This patch changes the LKMM to accept either a wr-vis or a reverse
    rw-xbstar link as a proof of non-concurrency.
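
    In linux-kernel.cat, the change amounts to roughly the following
    (paraphrased, not the literal patch):

        (* Before: only visibility of W to R could rule out a race *)
        let wr-race = (pre-race & (co? ; rf)) \ wr-vis

        (* After: R executing before W also rules one out *)
        let wr-race = (pre-race & (co? ; rf)) \ wr-vis \ rw-xbstar^-1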

    Signed-off-by: Alan Stern
    Acked-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Sasha Levin

    Alan Stern
     

10 Aug, 2019

2 commits

  • The formal memory consistency model has added support for plain accesses
    (and data races). While updating the informal documentation to fully
    describe this addition is highly desirable and important future work,
    update the informal documentation to at least acknowledge the addition.

    Signed-off-by: Andrea Parri
    Cc: Will Deacon
    Cc: Peter Zijlstra
    Cc: Boqun Feng
    Cc: Nicholas Piggin
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Luc Maranget
    Cc: "Paul E. McKenney"
    Cc: Akira Yokosawa
    Cc: Daniel Lustig
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern

    Andrea Parri
     
  • To reduce ambiguity in the more exotic ->prop ordering example, this
    commit uses the term cumul-fence instead of the term fence for the two
    fences, so that the implicit ->rfe on loads/stores to Y is covered by
    the description.

    Link: https://lore.kernel.org/lkml/20190729121745.GA140682@google.com

    Suggested-by: Alan Stern
    Signed-off-by: Joel Fernandes (Google)
    Acked-by: Alan Stern
    Signed-off-by: Paul E. McKenney

    Joel Fernandes (Google)
     

25 Jun, 2019

1 commit

  • Herbert Xu recently reported a problem concerning RCU and compiler
    barriers. In the course of discussing the problem, he put forth a
    litmus test which illustrated a serious defect in the Linux Kernel
    Memory Model's data-race-detection code [1].

    The defect was that the LKMM assumed visibility and executes-before
    ordering of plain accesses had to be mediated by marked accesses. In
    Herbert's litmus test this wasn't so, and the LKMM claimed the litmus
    test was allowed and contained a data race although neither is true.

    In fact, plain accesses can be ordered by fences even in the absence
    of marked accesses. In most cases this doesn't matter, because most
    fences only order accesses within a single thread. But the rcu-fence
    relation is different; it can order (and induce visibility between)
    accesses in different threads -- events which otherwise might be
    concurrent. This makes it relevant to data-race detection.

    This patch makes two changes to the memory model to incorporate the
    new insight:

    If a store is separated by a fence from another access,
    the store is necessarily visible to the other access (as
    reflected in the ww-vis and wr-vis relations). Similarly,
    if a load is separated by a fence from another access then
    the load necessarily executes before the other access (as
    reflected in the rw-xbstar relation).

    If a store is separated by a strong fence from a marked access
    then it is necessarily visible to any access that executes
    after the marked access (as reflected in the ww-vis and wr-vis
    relations).
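
    Sketched in cat notation (paraphrased from the resulting definitions;
    details may differ slightly):

        let ww-vis = fence | (strong-fence ; xbstar ; w-pre-bounded) |
            (w-post-bounded ; vis ; w-pre-bounded)
        let wr-vis = fence | (strong-fence ; xbstar ; r-pre-bounded) |
            (w-post-bounded ; vis ; r-pre-bounded)
        let rw-xbstar = fence | (r-post-bounded ; xbstar ; w-pre-bounded)

    The leading "fence" alternatives correspond to the first change (a
    fence alone orders the two accesses) and the "strong-fence ; xbstar"
    alternatives to the second.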

    With these changes, the LKMM gives the desired result for Herbert's
    litmus test and other related ones [2].

    [1] https://lore.kernel.org/lkml/Pine.LNX.4.44L0.1906041026570.1731-100000@iolanthe.rowland.org/

    [2] https://github.com/paulmckrcu/litmus/blob/master/manual/plain/C-S-rcunoderef-1.litmus
    https://github.com/paulmckrcu/litmus/blob/master/manual/plain/C-S-rcunoderef-2.litmus
    https://github.com/paulmckrcu/litmus/blob/master/manual/plain/C-S-rcunoderef-3.litmus
    https://github.com/paulmckrcu/litmus/blob/master/manual/plain/C-S-rcunoderef-4.litmus
    https://github.com/paulmckrcu/litmus/blob/master/manual/plain/strong-vis.litmus

    Reported-by: Herbert Xu
    Signed-off-by: Alan Stern
    Acked-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Tested-by: Akira Yokosawa

    Alan Stern
     

22 Jun, 2019

2 commits

  • The rcu-fence relation in the Linux Kernel Memory Model is not well
    named. It doesn't act like any other fence relation, in that it does
    not relate events before a fence to events after that fence. All it
    does is relate certain RCU events to one another (those that are
    ordered by the RCU Guarantee); this induces an actual
    strong-fence-like relation linking events preceding the first RCU
    event to those following the second.

    This patch renames rcu-fence to rcu-order. It adds a new definition
    of rcu-fence, something which should have been present all along
    because it is used in the rb relation. And it modifies the fence and
    strong-fence relations by making them incorporate the new rcu-fence.

    As a result of this change, there is no longer any need to define
    full-fence in the section for detecting data races. It can simply be
    replaced by the updated strong-fence relation.

    This change should have no effect on the operation of the memory model.
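
    In cat notation the renaming has roughly this shape (paraphrased):

        let rcu-order = ...    (* the relation formerly named rcu-fence *)
        let rcu-fence = po ; rcu-order ; po?

    with the fence and strong-fence relations then extended to include
    the new rcu-fence.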

    Signed-off-by: Alan Stern
    Acked-by: Andrea Parri
    Signed-off-by: Paul E. McKenney

    Alan Stern
     
  • Commit 66be4e66a7f4 ("rcu: locking and unlocking need to always be at
    least barriers") added compiler barriers back into rcu_read_lock() and
    rcu_read_unlock(). Furthermore, srcu_read_lock() and
    srcu_read_unlock() have always contained compiler barriers.

    The Linux Kernel Memory Model ought to know about these barriers.
    This patch adds them into the memory model.

    Signed-off-by: Alan Stern
    Acked-by: Andrea Parri
    Signed-off-by: Paul E. McKenney

    Alan Stern
     

28 May, 2019

3 commits

  • This patch adds data-race detection to the Linux-Kernel Memory Model.
    As part of this effort, support is added for:

    compiler barriers (the barrier() function), and

    a new Preserved Program Order term: (addr ; [Plain] ; wmb)

    Data races are marked with a special Flag warning in herd. It is
    not guaranteed that the model will provide accurate predictions when a
    data race is present.
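
    For illustration, a minimal (hypothetical) test that the new code
    flags: the plain read of *x in P1 is unordered against the marked
    write in P0, so herd reports a data race.

        C plain-race

        {}

        P0(int *x)
        {
            WRITE_ONCE(*x, 1);
        }

        P1(int *x)
        {
            int r0;

            r0 = *x;
        }

        exists (1:r0=1)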

    The patch does not include documentation for the data-race detection
    facility. The basic design has been explained in various emails, and
    a separate documentation patch will be submitted later.

    This work is based on an earlier formulation of data races for the
    LKMM by Andrea Parri.

    Signed-off-by: Alan Stern
    Reviewed-by: Andrea Parri
    Signed-off-by: Paul E. McKenney

    Alan Stern
     
  • This patch adds definitions for marked and plain accesses to the
    Linux-Kernel Memory Model. It also modifies the definitions of the
    existing parts of the model (including the cumul-fence, prop, hb, pb,
    and rb relations) so as to make them apply only to marked accesses.
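
    In linux-kernel.bell the new definitions look roughly like this
    (paraphrased; the exact event-class list may differ):

        let Marked = (~M) | IW | Once | Release | Acquire |
            domain(rmw) | range(rmw) | LKR | LKW | UL | LF | RL | RU
        let Plain = M \ Marked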

    Signed-off-by: Alan Stern
    Reviewed-by: Andrea Parri
    Signed-off-by: Paul E. McKenney

    Alan Stern
     
  • This patch makes some slight alterations to linux-kernel.cat in
    preparation for adding support for data-race detection to the
    Linux-Kernel Memory Model.

    The definitions of relations involved in Acquire, Release, and
    unlock-lock ordering are moved up earlier in the source file.

    The rmb relation is factored through the new R4rmb class: the
    class of reads to which rmb will apply.

    The definition of the fence relation is moved earlier, and it
    is split up into read- and write-fences (rmb and wmb) and all
    the others.

    This should not make any functional changes.

    Signed-off-by: Alan Stern
    Reviewed-by: Andrea Parri
    Signed-off-by: Paul E. McKenney

    Alan Stern
     

19 Mar, 2019

7 commits

  • Currently, herdtools version information appears no fewer than three
    times in the LKMM source, which is difficult to maintain. This commit
    therefore places the required version in one place, namely the
    tools/memory-model/README file.

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern

    Andrea Parri
     
  • This commit checks that the return value of srcu_read_lock() is passed
    to the matching srcu_read_unlock(), where "matching" is determined by
    nesting. This check operates as follows:

    1. srcu_read_lock() creates an integer token, which is stored into
    the generated events.
    2. srcu_read_unlock() records its second (token) argument into the
    generated event.
    3. A new herd primitive 'different-values' filters out pairs of events
    with identical values from the relation passed as its argument.
    4. The bell file applies the above primitive to the (srcu)
    read-side-critical-section relation 'srcu-rscs' and flags non-empty
    results.

    BEWARE: Works only with herd version 7.51+6 and onwards.
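
    For example, in this (hypothetical) nested sequence the first
    srcu_read_unlock() matches the second srcu_read_lock() by nesting,
    yet receives the first lock's token, so the check flags it:

        P0(struct srcu_struct *s)
        {
            int r1;
            int r2;

            r1 = srcu_read_lock(s);
            r2 = srcu_read_lock(s);
            srcu_read_unlock(s, r1);
            srcu_read_unlock(s, r2);
        }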

    Signed-off-by: Luc Maranget
    Signed-off-by: Paul E. McKenney
    [ paulmck: Apply Andrea Parri's off-list feedback. ]
    Acked-by: Alan Stern

    Luc Maranget
     
  • The recent commit adding support for SRCU to the Linux Kernel Memory
    Model ended up changing the names and meanings of several relations.
    This patch updates the explanation.txt documentation file to reflect
    those changes.

    It also revises the statement of the RCU Guarantee to a more accurate
    form, and it adds a short paragraph mentioning the new support for SRCU.

    Signed-off-by: Alan Stern
    Cc: Akira Yokosawa
    Cc: Andrea Parri
    Cc: Boqun Feng
    Cc: Daniel Lustig
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Signed-off-by: Paul E. McKenney
    Acked-by: Andrea Parri

    Alan Stern
     
  • This commit updates the section on LKMM limitations to no longer say
    that SRCU is not modeled, but instead describe how LKMM's modeling of
    SRCU departs from the Linux-kernel implementation.

    TL;DR: There is no known valid use case that cares about the Linux
    kernel's ability to have partially overlapping SRCU read-side critical
    sections.

    Signed-off-by: Paul E. McKenney
    Acked-by: Andrea Parri

    Paul E. McKenney
     
  • Add support for SRCU. Herd creates srcu events and linux-kernel.def
    associates them with three possible annotations (srcu-lock,
    srcu-unlock, and sync-srcu) corresponding to the API routines
    srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu().

    The linux-kernel.bell file now declares the annotations
    and determines matching lock/unlock pairs delimiting SRCU read-side
    critical sections, and it also checks for synchronize_srcu() calls
    inside an RCU critical section (which would generate a "sleeping in
    atomic context" error in real kernel code). The linux-kernel.cat file
    now adds SRCU-induced ordering, analogous to the existing RCU-induced
    ordering, to the gp and rcu-fence relations.

    Curiously enough, these small changes to the model's .cat code are all
    that is needed to describe SRCU.
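
    A (hypothetical) SRCU analogue of the classic RCU-guarantee test
    illustrates the new ordering; the model should report the cycle
    forbidden, because an SRCU read-side critical section cannot span a
    full synchronize_srcu() grace period:

        C C-srcu-guarantee

        {}

        P0(int *x, int *y, struct srcu_struct *s)
        {
            int r0;
            int r1;
            int r2;

            r0 = srcu_read_lock(s);
            r1 = READ_ONCE(*x);
            r2 = READ_ONCE(*y);
            srcu_read_unlock(s, r0);
        }

        P1(int *x, int *y, struct srcu_struct *s)
        {
            WRITE_ONCE(*y, 1);
            synchronize_srcu(s);
            WRITE_ONCE(*x, 1);
        }

        exists (0:r1=1 /\ 0:r2=0)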

    Portions of this patch (linux-kernel.def and the first hunk in
    linux-kernel.bell) were written by Luc Maranget.

    Signed-off-by: Alan Stern
    CC: Luc Maranget
    Signed-off-by: Paul E. McKenney
    Tested-by: Andrea Parri

    Alan Stern
     
  • In preparation for adding support for SRCU, refactor the definitions
    of rcu-fence, rcu-rscsi, rcu-link, and rb by moving the po and po?
    terms from the first two to the second two. An rcu-gp relation is
    added; it is equivalent to gp with the po and po? terms removed.

    This is necessary because for SRCU, we will have to use the loc
    relation to check that the terms at the start and end of each disjunct
    in the definition of rcu-fence refer to the same srcu_struct
    location. If these terms are hidden behind po and po?, there's no way
    to carry out this check.
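
    In cat notation the refactoring looks roughly like (paraphrased):

        let rcu-gp = [Sync-rcu]    (* gp with the po and po? terms removed *)
        let gp = po ; rcu-gp ; po?

    so that rcu-fence can be built from rcu-gp and rcu-rscsi directly,
    with the po and po? terms supplied by rcu-link and rb.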

    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Tested-by: Andrea Parri

    Alan Stern
     
  • In preparation for adding support for SRCU, rename "crit" to
    "rcu-rscs", rename "rscs" to "rcu-rscsi", and remove the restriction
    to only the outermost level of nesting.

    The name change is needed for disambiguating RCU read-side critical
    sections from SRCU read-side critical sections. Adding the "i" at the
    end of "rcu-rscsi" emphasizes that the relation is inverted; it links
    rcu_read_unlock() events to their corresponding preceding
    rcu_read_lock() events.

    The restriction to outermost nesting levels was never essential; it
    was included mostly to show that it could be done. Rather than add
    equivalent unnecessary code for SRCU lock nesting, it seemed better to
    remove the existing code.

    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Tested-by: Andrea Parri

    Alan Stern
     

21 Jan, 2019

3 commits

  • The "--jobs" argument to the litmus-test scripts is similar to the "-jN"
    argument to "make", so this commit allows the "-jN" form as well. While
    in the area, it also prohibits the various forms of "-j0".
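
    For example, the following invocations become equivalent (paths
    illustrative):

        tools/memory-model/scripts/initlitmushist.sh --jobs 8
        tools/memory-model/scripts/initlitmushist.sh -j8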

    Suggested-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20181203230451.28921-3-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • The https://github.com/paulmckrcu/litmus repository contains a large
    number of C-language litmus tests that include "Result:" comments
    predicting the verification result. This commit adds a number of scripts
    that run tests on these litmus tests:

    checkghlitmus.sh:
    Runs all litmus tests in the https://github.com/paulmckrcu/litmus
    archive that are C-language and that have "Result:" comment lines
    documenting expected results, comparing the actual results to
    those expected. Clones the repository if it has not already
    been cloned into the "tools/memory-model/litmus" directory.

    initlitmushist.sh:
    Run all litmus tests having no more than the specified number
    of processes given a specified timeout, recording the results in
    .litmus.out files. Clones the repository if it has not already
    been cloned into the "tools/memory-model/litmus" directory.

    newlitmushist.sh:
    For all new or updated litmus tests having no more than the
    specified number of processes given a specified timeout, run
    and record the results in .litmus.out files.

    checklitmushist.sh:
    Run all litmus tests having .litmus.out files from previous
    initlitmushist.sh or newlitmushist.sh runs, comparing the
    herd output to that of the original runs.

    The above scripts will run litmus tests concurrently, by default with
    one job per available CPU. Giving any of these scripts the --help
    argument will cause them to print usage information.

    This commit also adds a number of helper scripts that are not intended
    to be invoked from the command line:

    cmplitmushist.sh: Compare the output of two different runs of the same
    litmus test.

    judgelitmus.sh: Compare the output of a litmus test to its "Result:"
    comment line.

    parseargs.sh: Parse command-line arguments.

    runlitmushist.sh: Run the litmus tests whose pathnames are provided one
    per line on standard input.

    While in the area, this commit also makes the existing checklitmus.sh
    and checkalllitmus.sh scripts use parseargs.sh in order to provide a
    bit of uniformity. In addition, per-litmus-test status output is directed
    to stdout, while end-of-test summary information is directed to stderr.
    Finally, the error flag standardizes on "!!!" to assist those familiar
    with rcutorture output.

    The defaults for the parseargs.sh arguments may be overridden by using
    environment variables: LKMM_DESTDIR for --destdir, LKMM_HERD_OPTIONS
    for --herdoptions, LKMM_JOBS for --jobs, LKMM_PROCS for --procs, and
    LKMM_TIMEOUT for --timeout.
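
    For example (illustrative values):

        LKMM_TIMEOUT=5m LKMM_PROCS=4 LKMM_DESTDIR=/tmp/lkmm \
            tools/memory-model/scripts/initlitmushist.sh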

    [ paulmck: History-check summary-line changes per Alan Stern feedback. ]
    Signed-off-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: stern@rowland.harvard.edu
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20181203230451.28921-2-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • The kernel documents smp_mb__after_unlock_lock() the following way:

    "Place this after a lock-acquisition primitive to guarantee that
    an UNLOCK+LOCK pair acts as a full barrier. This guarantee applies
    if the UNLOCK and LOCK are executed by the same CPU or if the
    UNLOCK and LOCK operate on the same lock variable."

    Formalize in LKMM the above guarantee by defining (new) mb-links according
    to the law:

    ([M] ; po ; [UL] ; (co | po) ; [LKW] ;
    fencerel(After-unlock-lock) ; [M])

    where the component ([UL] ; co ; [LKW]) identifies "UNLOCK+LOCK pairs on
    the same lock variable" and the component ([UL] ; po ; [LKW]) identifies
    "UNLOCK+LOCK pairs executed by the same CPU".

    In particular, the LKMM forbids the following two behaviors (the second
    litmus test below is based on:

    Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html

    c.f., Section "Tree RCU Grace Period Memory Ordering Building Blocks"):

    C after-unlock-lock-same-cpu

    (*
     * Result: Never
     *)

    {}

    P0(spinlock_t *s, spinlock_t *t, int *x, int *y)
    {
        int r0;

        spin_lock(s);
        WRITE_ONCE(*x, 1);
        spin_unlock(s);
        spin_lock(t);
        smp_mb__after_unlock_lock();
        r0 = READ_ONCE(*y);
        spin_unlock(t);
    }

    P1(int *x, int *y)
    {
        int r0;

        WRITE_ONCE(*y, 1);
        smp_mb();
        r0 = READ_ONCE(*x);
    }

    exists (0:r0=0 /\ 1:r0=0)

    C after-unlock-lock-same-lock-variable

    (*
     * Result: Never
     *)

    {}

    P0(spinlock_t *s, int *x, int *y)
    {
        int r0;

        spin_lock(s);
        WRITE_ONCE(*x, 1);
        r0 = READ_ONCE(*y);
        spin_unlock(s);
    }

    P1(spinlock_t *s, int *y, int *z)
    {
        int r0;

        spin_lock(s);
        smp_mb__after_unlock_lock();
        WRITE_ONCE(*y, 1);
        r0 = READ_ONCE(*z);
        spin_unlock(s);
    }

    P2(int *z, int *x)
    {
        int r0;

        WRITE_ONCE(*z, 1);
        smp_mb();
        r0 = READ_ONCE(*x);
    }

    exists (0:r0=0 /\ 1:r0=0 /\ 2:r0=0)

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Boqun Feng
    Cc: Daniel Lustig
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20181203230451.28921-1-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     

02 Oct, 2018

4 commits

  • This commit adds more detail about compiler optimizations and
    not-yet-modeled Linux-kernel APIs.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Andrea Parri
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: stern@rowland.harvard.edu
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180926182920.27644-4-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • This commit fixes a duplicate-"the" typo in README.

    Signed-off-by: SeongJae Park
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180926182920.27644-3-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    SeongJae Park
     
  • More than one kernel developer has expressed the opinion that the LKMM
    should enforce ordering of writes by locking. In other words, given
    the following code:

    WRITE_ONCE(x, 1);
    spin_unlock(&s);
    spin_lock(&s);
    WRITE_ONCE(y, 1);

    the stores to x and y should be propagated in order to all other CPUs,
    even though those other CPUs might not access the lock s. In terms of
    the memory model, this means expanding the cumul-fence relation.

    Locks should also provide read-read (and read-write) ordering in a
    similar way. Given:

    READ_ONCE(x);
    spin_unlock(&s);
    spin_lock(&s);
    READ_ONCE(y); // or WRITE_ONCE(y, 1);

    the load of x should be executed before the load of (or store to) y.
    The LKMM already provides this ordering, but it provides it even in
    the case where the two accesses are separated by a release/acquire
    pair of fences rather than unlock/lock. This would prevent
    architectures from using weakly ordered implementations of release and
    acquire, which seems like an unnecessary restriction. The patch
    therefore removes the ordering requirement from the LKMM for that
    case.

    There are several arguments both for and against this change. Let us
    refer to these enhanced ordering properties by saying that the LKMM
    would require locks to be RCtso (a bit of a misnomer, but analogous to
    RCpc and RCsc) and it would require ordinary acquire/release only to
    be RCpc. (Note: In the following, the phrase "all supported
    architectures" is meant not to include RISC-V. Although RISC-V is
    indeed supported by the kernel, the implementation is still somewhat
    in a state of flux and therefore statements about it would be
    premature.)

    Pros:

    The kernel already provides RCtso ordering for locks on all
    supported architectures, even though this is not stated
    explicitly anywhere. Therefore the LKMM should formalize it.

    In theory, guaranteeing RCtso ordering would reduce the need
    for additional barrier-like constructs meant to increase the
    ordering strength of locks.

    Will Deacon and Peter Zijlstra are strongly in favor of
    formalizing the RCtso requirement. Linus Torvalds and Will
    would like to go even further, requiring locks to have RCsc
    behavior (ordering preceding writes against later reads), but
    they recognize that this would incur a noticeable performance
    degradation on the POWER architecture. Linus also points out
    that people have made the mistake, in the past, of assuming
    that locking has stronger ordering properties than is
    currently guaranteed, and this change would reduce the
    likelihood of such mistakes.

    Not requiring ordinary acquire/release to be any stronger than
    RCpc may prove advantageous for future architectures, allowing
    them to implement smp_load_acquire() and smp_store_release()
    with more efficient machine instructions than would be
    possible if the operations had to be RCtso. Will and Linus
    approve this rationale, hypothetical though it is at the
    moment (it may end up affecting the RISC-V implementation).
    The same argument may or may not apply to RMW-acquire/release;
    see also the second Con entry below.

    Linus feels that locks should be easy for people to use
    without worrying about memory consistency issues, since they
    are so pervasive in the kernel, whereas acquire/release is
    much more of an "experts only" tool. Requiring locks to be
    RCtso is a step in this direction.

    Cons:

    Andrea Parri and Luc Maranget think that locks should have the
    same ordering properties as ordinary acquire/release (indeed,
    Luc points out that the names "acquire" and "release" derive
    from the usage of locks). Andrea points out that having
    different ordering properties for different forms of acquires
    and releases is not only unnecessary, it would also be
    confusing and unmaintainable.

    Locks are constructed from lower-level primitives, typically
    RMW-acquire (for locking) and ordinary release (for unlock).
    It is illogical to require stronger ordering properties from
    the high-level operations than from the low-level operations
    they comprise. Thus, this change would make

    while (cmpxchg_acquire(&s, 0, 1) != 0)
        cpu_relax();

    an incorrect implementation of spin_lock(&s) as far as the
    LKMM is concerned. In theory this weakness can be ameliorated
    by changing the LKMM even further, requiring
    RMW-acquire/release also to be RCtso (which it already is on
    all supported architectures).

    As far as I know, nobody has singled out any examples of code
    in the kernel that actually relies on locks being RCtso.
    (People mumble about RCU and the scheduler, but nobody has
    pointed to any actual code. If there are any real cases,
    their number is likely quite small.) If RCtso ordering is not
    needed, why require it?

    A handful of locking constructs (qspinlocks, qrwlocks, and
    mcs_spinlocks) are built on top of smp_cond_load_acquire()
    instead of an RMW-acquire instruction. It currently provides
    only the ordinary acquire semantics, not the stronger ordering
    this patch would require of locks. In theory this could be
    ameliorated by requiring smp_cond_load_acquire() in
    combination with ordinary release also to be RCtso (which is
    currently true on all supported architectures).

    On future weakly ordered architectures, people may be able to
    implement locks in a non-RCtso fashion with significant
    performance improvement. Meeting the RCtso requirement would
    necessarily add run-time overhead.

    Overall, the technical aspects of these arguments seem relatively
    minor, and it appears mostly to boil down to a matter of opinion.
    Since the opinions of senior kernel maintainers such as Linus,
    Peter, and Will carry more weight than those of Luc and Andrea, this
    patch changes the model in accordance with the maintainers' wishes.

    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Will Deacon
    Reviewed-by: Andrea Parri
    Acked-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20180926182920.27644-2-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Alan Stern
     
  • This commit documents the scheme used to generate the names for the
    litmus tests.

    [ paulmck: Apply feedback from Andrea Parri and Will Deacon. ]
    Signed-off-by: Paul E. McKenney
    Acked-by: Will Deacon
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: stern@rowland.harvard.edu
    Link: http://lkml.kernel.org/r/20180926182920.27644-1-paulmck@linux.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

17 Jul, 2018

7 commits

  • norm7 produces the 'normalized' name of a litmus test, when the test
    can be generated from a single cycle that passes through each process
    exactly once. The commit renames such tests in order to comply with
    the naming scheme implemented by this tool.

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern
    Cc: Akira Yokosawa
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-14-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     
  • The tools/memory-model/Documentation/explanation.txt file says
    "For each other CPU C', smb_wmb() forces all po-earlier stores".
    This commit therefore replaces the "smb_wmb()" with "smp_wmb()".

    Signed-off-by: Yauheni Kaliuta
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-13-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Yauheni Kaliuta
     
  • This commit makes the scripts executable to avoid the need for everyone
    to do so manually in their archive.

    Signed-off-by: Paul E. McKenney
    Acked-by: Akira Yokosawa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: stern@rowland.harvard.edu
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-7-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Since commit:

    b899a850431e2dd0 ("compiler.h: Remove ACCESS_ONCE()")

    ... there has been no definition of ACCESS_ONCE() in the kernel tree,
    and it has been necessary to use READ_ONCE() or WRITE_ONCE() instead.

    Correspondingly, let's remove ACCESS_ONCE() from the kernel memory
    model.

    Signed-off-by: Mark Rutland
    Signed-off-by: Paul E. McKenney
    Acked-by: Andrea Parri
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-6-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • Since commit:

    b899a850431e2dd0 ("compiler.h: Remove ACCESS_ONCE()")

    ... there has been no definition of ACCESS_ONCE() in the kernel tree,
    and it has been necessary to use READ_ONCE() or WRITE_ONCE() instead.

    Let's update the examples in recipes.txt likewise for consistency, using
    READ_ONCE() for reads.

    Signed-off-by: Mark Rutland
    Signed-off-by: Paul E. McKenney
    Acked-by: Andrea Parri
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-5-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     
  • The names on the first line of the litmus tests are arbitrary,
    but the convention is that they be the filename without the trailing
    ".litmus". This commit therefore removes the stray trailing ".litmus"
    from ISA2+pooncelock+pooncelock+pombonce.litmus's name.

    Reported-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-2-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
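    Concretely, under this convention the file
    ISA2+pooncelock+pooncelock+pombonce.litmus must begin with:

```
C ISA2+pooncelock+pooncelock+pombonce
```

    that is, the filename minus the trailing ".litmus"; previously the
    name on that line carried the suffix as well.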
     
  • This commit adds a litmus test suggested by Alan Stern that is forbidden
    on fully multicopy atomic systems, but allowed on other-multicopy and
    on non-multicopy atomic systems. For reference, s390 is fully multicopy
    atomic, x86 and ARMv8 are other-multicopy atomic, and ARMv7 and powerpc
    are non-multicopy atomic.

    Suggested-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Acked-by: Alan Stern
    Acked-by: Andrea Parri
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: will.deacon@arm.com
    Link: http://lkml.kernel.org/r/20180716180605.16115-1-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
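    The commit text does not reproduce the test itself. The pattern that
    separates fully multicopy-atomic systems from the rest is store
    buffering in which each CPU also reads back its own store; a sketch
    of that shape (assuming the general form, not necessarily the exact
    test the commit added):

```
C SB+rfionceonce-sketch

{}

P0(int *x, int *y)
{
	int r1;
	int r2;

	WRITE_ONCE(*x, 1);
	r1 = READ_ONCE(*x);
	r2 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r3;
	int r4;

	WRITE_ONCE(*y, 1);
	r3 = READ_ONCE(*y);
	r4 = READ_ONCE(*x);
}

exists (0:r1=1 /\ 0:r2=0 /\ 1:r3=1 /\ 1:r4=0)
```

    On an other-multicopy-atomic system such as x86 or ARMv8, each CPU
    may satisfy r1/r3 by forwarding from its own store buffer before the
    store is visible to the other CPU, so the "exists" clause can fire.
    Full multicopy atomicity means a store becomes visible to all CPUs,
    including the issuing one, at the same time, which rules out such
    early self-visibility and forbids the outcome.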
     

15 May, 2018

6 commits

  • The paper discusses the revised ARMv8 memory model; this revision
    had an important impact on the design of the LKMM.

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/1526340837-12222-19-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     
  • ASPLOS 2018 was held in March: make sure this is reflected in
    header comments and references.

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Alan Stern
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/1526340837-12222-18-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     
  • This commit uses tabs for indentation and adds spaces around binary
    operators.

    Signed-off-by: Andrea Parri
    Signed-off-by: Paul E. McKenney
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: akiyks@gmail.com
    Cc: boqun.feng@gmail.com
    Cc: dhowells@redhat.com
    Cc: j.alglave@ucl.ac.uk
    Cc: linux-arch@vger.kernel.org
    Cc: luc.maranget@inria.fr
    Cc: npiggin@gmail.com
    Cc: parri.andrea@gmail.com
    Cc: stern@rowland.harvard.edu
    Link: http://lkml.kernel.org/r/1526340837-12222-16-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Andrea Parri
     
  • lock.cat contains old comments and code referring to the possibility
    of LKR events that are not part of an RMW pair. This is a holdover
    from when I thought we might end up using LKR events to implement
    spin_is_locked(). Reword the comments to remove this assumption and
    replace domain(lk-rmw) in the code with LKR.

    Tested-by: Andrea Parri
    [ paulmck: Pulled as lock-nest into previous line as discussed. ]
    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/1526340837-12222-15-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Alan Stern
     
  • The code in lock.cat which checks for normal read/write accesses to
    spinlock variables doesn't take into account the newly added RL and RU
    events. Add them to the test, and move the resulting code up near
    the start of the file, since a violation would indicate a pretty
    severe conceptual error in a litmus test.

    Tested-by: Andrea Parri
    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/1526340837-12222-14-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Alan Stern
     
  • This patch improves the comments in tools/memory-model/lock.cat. In
    addition to making the text more uniform and removing redundant
    comments, it adds a description of all the possible locking events
    that herd can generate.

    Tested-by: Andrea Parri
    Signed-off-by: Alan Stern
    Signed-off-by: Paul E. McKenney
    Cc: Akira Yokosawa
    Cc: Andrew Morton
    Cc: Boqun Feng
    Cc: David Howells
    Cc: Jade Alglave
    Cc: Linus Torvalds
    Cc: Luc Maranget
    Cc: Nicholas Piggin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: linux-arch@vger.kernel.org
    Cc: parri.andrea@gmail.com
    Link: http://lkml.kernel.org/r/1526340837-12222-13-git-send-email-paulmck@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Alan Stern