24 Sep, 2013

1 commit

  • The list_splice_init_rcu() function allows a list visible to RCU readers
    to be spliced into another list visible to RCU readers. This is OK,
    except for the use of INIT_LIST_HEAD(), which does pointer updates
    without doing anything to make those updates safe for concurrent readers.

    Of course, most of the time INIT_LIST_HEAD() is being used in reader-free
    contexts, such as initialization or cleanup, so it is OK for it to update
    pointers in an unsafe-for-RCU-readers manner. This commit therefore
    creates an INIT_LIST_HEAD_RCU() that uses ACCESS_ONCE() to make the updates
    reader-safe. The reason that we can use ACCESS_ONCE() instead of the more
    typical rcu_assign_pointer() is that list_splice_init_rcu() is updating the
    pointers to reference something that is already visible to readers, so
    that there is no problem with pre-initialized values.
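
    A minimal sketch of such a helper, mirroring INIT_LIST_HEAD() but
    publishing each pointer update through ACCESS_ONCE() (the exact shape
    shown here is illustrative):

        static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
        {
                ACCESS_ONCE(list->next) = list;
                ACCESS_ONCE(list->prev) = list;
        }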

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

19 Aug, 2013

1 commit

  • list_first_or_null_rcu() should test whether the list is empty and, if
    not, return a pointer to the first entry, all in an RCU-safe manner. It
    is broken in several ways.

    * It compares __kernel @__ptr with __rcu @__next, triggering the
    following sparse warning:

    net/core/dev.c:4331:17: error: incompatible types in comparison expression (different address spaces)

    * It doesn't perform rcu_dereference*() and computes the entry address
    using container_of() directly from the __rcu pointer, which is
    inconsistent with the other rculist interfaces. As a result, all three
    in-kernel users - net/core/dev.c, macvlan, cgroup - are buggy. They
    dereference the pointer without going through a read barrier.

    * While ->next dereference passes through list_next_rcu(), the
    compiler is still free to fetch ->next more than once and thus
    nullify the "__ptr != __next" condition check.

    Fix it by making list_first_or_null_rcu() dereference ->next directly
    using ACCESS_ONCE() and then use list_entry_rcu() on it like other
    rculist accessors.
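
    For reference, a sketch of the fixed macro along these lines (following
    the parameter conventions of the surrounding rculist code):

        #define list_first_or_null_rcu(ptr, type, member) \
        ({ \
                struct list_head *__ptr = (ptr); \
                struct list_head *__next = ACCESS_ONCE(__ptr->next); \
                likely(__ptr != __next) ? list_entry_rcu(__next, type, member) : NULL; \
        })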

    v2: Paul pointed out that the compiler may fetch the pointer more than
    once nullifying the condition check. ACCESS_ONCE() added on
    ->next dereference.

    v3: Restored () around macro param which was accidentally removed.
    Spotted by Paul.

    Signed-off-by: Tejun Heo
    Reported-by: Fengguang Wu
    Cc: Dipankar Sarma
    Cc: "Paul E. McKenney"
    Cc: "David S. Miller"
    Cc: Li Zefan
    Cc: Patrick McHardy
    Cc: stable@vger.kernel.org
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Tejun Heo
     

29 May, 2013

1 commit

  • Under RCU debug config options, rcu_dereference_raw() can add quite a
    bit of checking, and tracing uses rcu_dereference_raw(), so these checks
    happen inside the function tracer. The function tracer also happens to
    trace these debug checks themselves. This added overhead can livelock
    the system.

    Add new RCU interfaces, rcu_dereference_raw_notrace() and
    hlist_for_each_entry_rcu_notrace(), since the hlist iterator uses
    rcu_dereference_raw() as well and is used quite a bit with the function
    tracer.
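
    A sketch of what the notrace variants can look like, assuming they
    simply mirror the existing primitives while substituting an untraced,
    unchecked dereference (the helper names below follow the existing
    rculist accessors):

        #define rcu_dereference_raw_notrace(p) \
                __rcu_dereference_check((p), 1, __rcu)

        #define hlist_for_each_entry_rcu_notrace(pos, head, member) \
                for (pos = hlist_entry_safe(rcu_dereference_raw_notrace(hlist_first_rcu(head)), \
                                typeof(*(pos)), member); \
                     pos; \
                     pos = hlist_entry_safe(rcu_dereference_raw_notrace(hlist_next_rcu(&(pos)->member)), \
                                typeof(*(pos)), member))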

    Link: http://lkml.kernel.org/r/20130528184209.304356745@goodmis.org

    Acked-by: Paul E. McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from their list counterpart, which looks like this:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    is it unnecessary, it also prevents the iterator from looking exactly
    like the list iterator, which is unfortunate.
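
    A hypothetical caller before and after the conversion ('struct foo',
    its hlist_node member 'link', the hash head 'head', and process() are
    made-up names):

        struct foo {
                struct hlist_node link;
                int value;
        };

        struct foo *f;
        struct hlist_node *node;    /* only needed by the old iterator */

        /* Old signature: a separate hlist_node cursor had to be passed. */
        hlist_for_each_entry(f, node, &head, link)
                process(f->value);

        /* New signature: the entry pointer is the only cursor, matching
           list_for_each_entry(). */
        hlist_for_each_entry(f, &head, link)
                process(f->value);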

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

14 Nov, 2012

1 commit


25 Apr, 2012

2 commits

  • The list_first_entry_rcu() macro is inherently unsafe because it cannot
    be applied to an empty list. But because RCU readers do not exclude
    updaters, a list might become empty between the time that list_empty()
    claimed it was non-empty and the time that list_first_entry_rcu() is
    invoked. Therefore, the list_empty() test cannot be separated from the
    list_first_entry_rcu() call. This commit therefore combines these two
    macros into a new list_first_or_null_rcu() macro that replaces
    the old (and unsafe) list_first_entry_rcu() macro.
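
    To illustrate the difference, a hypothetical reader-side snippet
    ('struct foo', 'mylist', and do_something() are made-up names):

        /* Unsafe: the list may become empty between the two calls. */
        if (!list_empty(&mylist))
                p = list_first_entry_rcu(&mylist, struct foo, list);

        /* Safe: the emptiness check and the entry fetch are one primitive. */
        p = list_first_or_null_rcu(&mylist, struct foo, list);
        if (p)
                do_something(p);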

    This patch incorporates Paul's review comments on the previous version of
    this patch available here:

    https://lkml.org/lkml/2012/4/2/536

    This patch cannot break any upstream code because list_first_entry_rcu()
    is not being used anywhere in the kernel (tested with grep(1)), and any
    external code using it is probably broken as a result of using it.

    Signed-off-by: Michel Machado
    CC: "Paul E. McKenney"
    CC: Dipankar Sarma
    Signed-off-by: Paul E. McKenney

    Michel Machado
     
  • * Make __list_add_rcu check the next->prev and prev->next pointers
    just like __list_add does.
    * Make list_del_rcu use __list_del_entry, which does the same checking
    at deletion time.

    Has been running for a week here without anything being tripped up,
    but it seems worth adding for completeness just in case something
    ever does corrupt those lists.
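
    A sketch of the kind of insertion-time check this describes, assuming
    CONFIG_DEBUG_LIST-style WARN() reporting as in __list_add() (the message
    wording is illustrative):

        static inline void __list_add_rcu(struct list_head *new,
                        struct list_head *prev, struct list_head *next)
        {
                WARN(next->prev != prev,
                     "list_add_rcu corruption: next->prev should be prev (%p), but was %p\n",
                     prev, next->prev);
                WARN(prev->next != next,
                     "list_add_rcu corruption: prev->next should be next (%p), but was %p\n",
                     next, prev->next);
                new->next = next;
                new->prev = prev;
                rcu_assign_pointer(list_next_rcu(prev), new);
                next->prev = new;
        }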

    Signed-off-by: Dave Jones
    Signed-off-by: Paul E. McKenney

    Dave Jones
     

21 Jul, 2011

1 commit

  • If the list to be spliced is empty, then list_splice_init_rcu() has
    nothing to do. Unfortunately, list_splice_init_rcu() does not check
    the list to be spliced; it instead checks the list to be spliced into.
    This results in memory leaks given current usage. This commit
    therefore fixes the empty-list check.
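
    A sketch of the corrected check, assuming the conventional parameter
    names where 'list' is the source list being spliced and 'head' is the
    destination:

        /* Wrong: tests the destination list instead of the source. */
        if (list_empty(head))
                return;

        /* Right: there is nothing to do only when the source list is empty. */
        if (list_empty(list))
                return;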

    Signed-off-by: Jan H. Schönherr
    Signed-off-by: Paul E. McKenney

    Jan H. Schönherr
     

20 May, 2011

2 commits

  • This removes the use of software prefetching from the regular list
    iterators. We don't want it. If you do want to prefetch in some
    iterator of yours, go right ahead. Just don't expect the iterator to do
    it, since normally the downsides are bigger than the upsides.

    It also replaces <linux/prefetch.h> with <linux/const.h>, because the
    use of LIST_POISON ends up needing it. <linux/poison.h> is sadly not
    self-contained, and including prefetch.h just happened to hide that.

    Suggested by David Miller (networking has a lot of regular lists that
    are often empty or a single entry, and prefetching is not going to do
    anything but add useless instructions).
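
    Roughly, the change to the plain iterator looks like this (a sketch;
    the actual patch touches several iterator variants):

        /* Before: an implicit prefetch of the next node on every step. */
        #define list_for_each(pos, head) \
                for (pos = (head)->next; prefetch(pos->next), pos != (head); \
                     pos = pos->next)

        /* After: plain pointer chasing, with no hidden prefetch. */
        #define list_for_each(pos, head) \
                for (pos = (head)->next; pos != (head); pos = pos->next)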

    Acked-by: Ingo Molnar
    Acked-by: David S. Miller
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The software prefetches in the hlist iterators not only increase the
    code footprint, they actually make things slower rather than faster. On
    internationally acclaimed benchmarks ("make -j16" on an already fully
    built kernel source tree) the hlist prefetching slows down the build by
    up to 1%.

    (Almost all of it comes from hlist_for_each_entry_rcu() as used by
    avc_has_perm_noaudit(), which is very hot due to all the pathname
    lookups to see if there is anything to do).

    The cause seems to be two-fold:

    - on at least some Intel cores, prefetch(NULL) ends up with some
    microarchitectural stall due to the TLB miss that it incurs. The
    hlist case triggers this very commonly, since the NULL pointer is the
    last entry in the list.

    - the prefetch appears to cause more D$ activity, probably because it
    prefetches hash list entries that are never actually used (because we
    ended the search early due to a hit).

    Regardless, the numbers clearly say that the implicit prefetching is
    simply a bad idea. If some _particular_ user of the hlist iterators
    wants to prefetch the next list entry, they can do so themselves
    explicitly, rather than depend on all list iterators doing so
    implicitly.

    Acked-by: Ingo Molnar
    Acked-by: David S. Miller
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

18 Dec, 2010

2 commits


21 Aug, 2010

1 commit


20 Aug, 2010

1 commit

  • This avoids warnings from missing __rcu annotations
    in the rculist implementation, making it possible to
    use the same lists in both RCU and non-RCU cases.

    We can add rculist annotations later, together with
    lockdep support for rculist, which is missing as well,
    but that may involve changing all the users.
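
    A sketch of the kind of accessor this implies, casting through the
    __rcu address space internally so that both RCU and non-RCU users keep
    sparse quiet:

        /* Return the __rcu-qualified ->next pointer of a list_head. */
        #define list_next_rcu(list)  (*((struct list_head __rcu **)(&(list)->next)))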

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Pavel Emelyanov
    Cc: Sukadev Bhattiprolu
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     

04 May, 2010

1 commit

  • Add hlist_for_each_entry_rcu_bh() and
    hlist_for_each_entry_continue_rcu_bh() macros, and use them in
    ipv6_get_ifaddr(), if6_get_first() and if6_get_next() to fix lockdep
    warnings.

    Signed-off-by: Eric Dumazet
    Reviewed-by: "Paul E. McKenney"
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Mar, 2010

1 commit


01 Mar, 2010

1 commit


25 Feb, 2010

1 commit

  • The theory is that use of bare rcu_dereference() is more prone
    to error than use of the RCU list-traversal primitives.
    Therefore, disable lockdep RCU read-side critical-section
    checking in these primitives for the time being. Once all of
    the rcu_dereference() uses have been dealt with, it may be time
    to re-enable lockdep checking for the RCU list-traversal
    primitives.
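
    In practice this amounts to the traversal primitives using the raw,
    unchecked dereference for now; a sketch (assuming the raw variant is
    what bypasses the lockdep checking) of the entry accessor:

        /* Lockdep checking deliberately bypassed for the time being. */
        #define list_entry_rcu(ptr, type, member) \
                container_of(rcu_dereference_raw(ptr), type, member)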

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

23 Feb, 2010

1 commit


11 Nov, 2009

1 commit


15 Apr, 2009

1 commit

  • I've run into a situation where I need to use list_first_entry with an
    RCU-guarded list. This patch introduces list_first_entry_rcu() for that.

    Also simplify list_for_each_entry_rcu() to use new list_entry_rcu()
    instead of list_entry().
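
    A sketch of the two helpers, assuming the obvious shapes built on
    rcu_dereference() and container_of():

        #define list_entry_rcu(ptr, type, member) \
                container_of(rcu_dereference(ptr), type, member)

        #define list_first_entry_rcu(ptr, type, member) \
                list_entry_rcu((ptr)->next, type, member)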

    Signed-off-by: Jiri Pirko
    Reviewed-by: Paul E. McKenney
    Cc: dipankar@in.ibm.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jiri Pirko
     

15 Aug, 2008

1 commit


29 Jul, 2008

1 commit

  • Introduce list_del_init_rcu() and document it.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Linus Torvalds
    Cc: "Paul E. McKenney"
    Cc: Ingo Molnar
    Cc: Christoph Lameter
    Cc: Jack Steiner
    Cc: Robin Holt
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Kanoj Sarcar
    Cc: Roland Dreier
    Cc: Steve Wise
    Cc: Avi Kivity
    Cc: Hugh Dickins
    Cc: Rusty Russell
    Cc: Anthony Liguori
    Cc: Chris Wright
    Cc: Marcelo Tosatti
    Cc: Eric Dumazet
    Cc: "Paul E. McKenney"
    Cc: Izik Eidus
    Cc: Anthony Liguori
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

19 May, 2008

3 commits

  • RCU list iterators: should prefetch ever be optimised out with no
    side-effects, the current version will lose the barrier completely.

    Pointed-out-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • Make almost all list mutation primitives use rcu_assign_pointer().

    The main point of this is readability improvement.
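
    For example, the core insertion helper ends up reading roughly like
    this (a sketch of the pattern rather than the exact patch):

        static inline void __list_add_rcu(struct list_head *new,
                        struct list_head *prev, struct list_head *next)
        {
                new->next = next;
                new->prev = prev;
                rcu_assign_pointer(prev->next, new);
                next->prev = new;
        }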

    Signed-off-by: Franck Bui-Huu
    Cc: "Paul E. McKenney"
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Franck Bui-Huu
     
  • Move rcu-protected lists from list.h into a new header file rculist.h.

    This is done because lists are a heavily used primitive structure all
    over the kernel, and it is currently impossible to include other header
    files in list.h without creating circular dependencies.

    For example, list.h implements the RCU-protected lists and uses
    rcu_dereference() without including rcupdate.h. It actually compiles
    because the users of rcu_dereference() are macros. Other RCU functions
    could be used too, but probably aren't because of this.

    Therefore this patch creates rculist.h, which includes rcupdate.h
    without too many changes or troubles.
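
    The resulting header boils down to something like this (a sketch of
    its structure only):

        #ifndef _LINUX_RCULIST_H
        #define _LINUX_RCULIST_H

        #include <linux/list.h>
        #include <linux/rcupdate.h>

        /* RCU-protected list primitives (list_add_rcu(), list_del_rcu(),
           list_for_each_entry_rcu(), ...) moved here from list.h. */

        #endif /* _LINUX_RCULIST_H */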

    Signed-off-by: Franck Bui-Huu
    Acked-by: Paul E. McKenney
    Acked-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Franck Bui-Huu