12 Dec, 2011

7 commits

  • The current rcu_batch_end event trace records only the name of the RCU
    flavor and the total number of callbacks that remain queued on the
    current CPU. This is insufficient for testing and tuning the new
    dyntick-idle RCU_FAST_NO_HZ code, so this commit adds idle state along
    with whether or not any of the callbacks that were ready to invoke
    at the beginning of rcu_do_batch() are still queued.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Change from direct comparison of ->pid with zero to is_idle_task().

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • RCU has traditionally relied on idle_cpu() to determine whether a given
    CPU is running in the context of an idle task, but commit 908a3283
    (Fix idle_cpu()) has invalidated this approach. After commit 908a3283,
    idle_cpu() will return true only if the current CPU is currently running
    the idle task and will continue to do so for the foreseeable future. RCU instead
    needs to know whether or not the current CPU is currently running the
    idle task, regardless of what the near future might bring.

    This commit therefore switches from idle_cpu() to "current->pid != 0".

    Reported-by: Wu Fengguang
    Suggested-by: Carsten Emde
    Signed-off-by: Paul E. McKenney
    Acked-by: Steven Rostedt
    Tested-by: Wu Fengguang
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
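The two commits above reduce the idle check to a PID comparison wrapped in a helper. A minimal userspace sketch of that idea (struct task_struct is a toy stand-in here; the real is_idle_task() lives in include/linux/sched.h):

```c
#include <stdbool.h>

/* Toy stand-in for the kernel's struct task_struct; only the field
 * needed for the idle check is modeled. */
struct task_struct {
    int pid;
};

/* The idle task is the only task with PID 0, so the check reduces to a
 * PID comparison; wrapping it in a helper documents the intent. */
static inline bool is_idle_task(const struct task_struct *t)
{
    return t->pid == 0;
}
```

Unlike idle_cpu(), this answers only "is this task the idle task right now", with no claim about the near future.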
     
  • The current code just complains if the current task is not the idle task.
    This commit therefore adds printing of the identity of the idle task.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The trace_rcu_dyntick() trace event did not print both the old and
    the new value of the nesting level, and furthermore printed only
    the low-order 32 bits of it. This could result in some confusion
    when interpreting trace-event dumps, so this commit prints both
    the old and the new value, prints the full 64 bits, and also selects
    the process-entry/exit increment to print nicely in hexadecimal.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Report that none of the rcu read lock maps are held while in an RCU
    extended quiescent state (the section between rcu_idle_enter()
    and rcu_idle_exit()). This helps detect any use of rcu_dereference()
    and friends from within the section in idle where RCU is not allowed.

    This way we can guarantee an extended quiescent window where the CPU
    can be put in dyntick-idle mode or can simply avoid being part of any
    global grace-period completion while in the idle loop.

    Uses of RCU from such mode are totally ignored by RCU, hence the
    importance of these checks.

    Signed-off-by: Frederic Weisbecker
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Frederic Weisbecker
     
  • Earlier versions of RCU used the scheduling-clock tick to detect idleness
    by checking for the idle task, but handled idleness differently for
    CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side
    critical sections in the idle task, for example, for tracing. A more
    fine-grained detection of idleness is therefore required.

    This commit presses the old dyntick-idle code into full-time service,
    so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
    always invoked at the beginning of an idle loop iteration. Similarly,
    rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
    at the end of an idle-loop iteration. This allows the idle task to
    use RCU everywhere except between consecutive rcu_idle_enter() and
    rcu_idle_exit() calls, in turn allowing architecture maintainers to
    specify exactly where in the idle loop that RCU may be used.

    Because some of the userspace upcall uses can result in what looks
    to RCU like half of an interrupt, it is not possible to expect that
    the irq_enter() and irq_exit() hooks will give exact counts. This
    patch therefore expands the ->dynticks_nesting counter to 64 bits
    and uses two separate bitfields to count process/idle transitions
    and interrupt entry/exit transitions. It is presumed that userspace
    upcalls do not happen in the idle loop or from usermode execution
    (though usermode might do a system call that results in an upcall).
    The counter is hard-reset on each process/idle transition, which
    prevents interrupt entry/exit errors from accumulating. Overflow
    is avoided by the 64-bitness of the ->dynticks_nesting counter.

    This commit also adds warnings if a non-idle task asks RCU to enter
    idle state (these checks will need some adjustment before applying
    Frederic's OS-jitter patches, http://lkml.org/lkml/2011/10/7/246).
    In addition, validation of ->dynticks and ->dynticks_nesting is added.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
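The counter scheme above can be sketched in userspace. The toy_ names and the increment value are illustrative, not the kernel's; the point is that process/idle transitions and irq entry/exit occupy separate ranges of one 64-bit counter, and that entering idle hard-resets it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Process/idle transitions use a large increment so that irq entry/exit
 * miscounts (from half-interrupt userspace upcalls) stay confined to
 * the low-order bits. */
#define TOY_TASK_NEST_VALUE (INT64_C(1) << 32)

static int64_t dynticks_nesting;

static void toy_rcu_idle_enter(void) { dynticks_nesting = 0; } /* hard reset */
static void toy_rcu_idle_exit(void)  { dynticks_nesting = TOY_TASK_NEST_VALUE; }
static void toy_rcu_irq_enter(void)  { dynticks_nesting++; }
static void toy_rcu_irq_exit(void)   { dynticks_nesting--; }

/* RCU treats the CPU as idle only when the whole counter is zero. */
static bool toy_rcu_cpu_is_idle(void) { return dynticks_nesting == 0; }
```

Because the hard reset discards any accumulated irq-count error, the irq hooks need not give exact counts.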
     

31 Oct, 2011

2 commits

  • The file rcutiny.c does not need the moduleparam.h header, as
    there are no module parameters in this file.

    However rcutiny_plugin.h does define a module_init() and
    a module_exit() and it uses the various MODULE_ macros, so
    it really does need module.h included.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • The changed files were only including linux/module.h for the
    EXPORT_SYMBOL infrastructure, and nothing else. Revector them
    onto the isolated export header for faster compile times.

    Nothing to see here but a whole lot of instances of:

    -#include <linux/module.h>
    +#include <linux/export.h>

    This commit is only changing the kernel dir; next targets
    will probably be mm, fs, the arch dirs, etc.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

29 Sep, 2011

6 commits

  • Add trace events to record grace-period start and end, quiescent states,
    CPUs noticing grace-period start and end, grace-period initialization,
    call_rcu() invocation, tasks blocking in RCU read-side critical sections,
    tasks exiting those same critical sections, force_quiescent_state()
    detection of dyntick-idle and offline CPUs, CPUs entering and leaving
    dyntick-idle mode (except from NMIs), CPUs coming online and going
    offline, and CPUs being kicked for staying in dyntick-idle mode for too
    long (as in many weeks, even on 32-bit systems).

    Signed-off-by: Paul E. McKenney

    rcu: Add the rcu flavor to callback trace events

    The earlier trace events for registering RCU callbacks and for invoking
    them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
    This commit adds the RCU flavor to those trace events.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This patch #ifdefs TINY_RCU kthreads out of the kernel unless RCU_BOOST=y,
    thus eliminating context-switch overhead if RCU priority boosting has
    not been configured.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Add a string to the rcu_batch_start() and rcu_batch_end() trace
    messages that indicates the RCU type ("rcu_sched", "rcu_bh", or
    "rcu_preempt"). The trace messages for the actual invocations
    themselves are not marked, as it should be clear from the
    rcu_batch_start() and rcu_batch_end() events before and after.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • In order to allow event tracing to distinguish between flavors of
    RCU, we need those names in the relevant RCU data structures. TINY_RCU
    has avoided them for memory-footprint reasons, so add them only if
    CONFIG_RCU_TRACE=y.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • There was recently some controversy about the overhead of invoking RCU
    callbacks. Add TRACE_EVENT()s to obtain fine-grained timings for the
    start and stop of a batch of callbacks and also for each callback invoked.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Pull the code that waits for an RCU grace period into a single function,
    which is then called by synchronize_rcu() and friends in the case of
    TREE_RCU and TREE_PREEMPT_RCU, and from rcu_barrier() and friends in
    the case of TINY_RCU and TINY_PREEMPT_RCU.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

21 May, 2011

1 commit

  • Commit e66eed651fd1 ("list: remove prefetching from regular list
    iterators") removed the include of prefetch.h from list.h, which
    uncovered several cases that had apparently relied on that rather
    obscure header file dependency.

    So this fixes things up a bit, using

    grep -L '<linux/prefetch.h>' $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
    grep -L 'prefetchw*(' $(git grep -l '<linux/prefetch.h>' -- '*.[ch]')

    to guide us in finding files that either need <linux/prefetch.h>
    inclusion, or have it despite not needing it.

    There are more of them around (mostly network drivers), but this gets
    many core ones.

    Reported-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 May, 2011

2 commits

  • rcu_sched_qs() currently calls local_irq_save()/local_irq_restore() up
    to three times.

    Remove the irq masking from rcu_qsctr_help() / invoke_rcu_kthread()
    and do it once in rcu_sched_qs() / rcu_bh_qs().

    This generates smaller code as well.

    text  data  bss   dec  hex  filename
    2314   156   24  2494  9be  kernel/rcutiny.old.o
    2250   156   24  2430  97e  kernel/rcutiny.new.o

    Also fix an outdated comment for rcu_qsctr_help(), and move the
    invoke_rcu_kthread() definition before its first use.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Eric Dumazet
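The refactoring can be modeled in userspace, with local_irq_save()/local_irq_restore() replaced by a call counter; all toy_ names are illustrative. The point is that the helpers no longer mask interrupts themselves, so the caller pays for one save/restore instead of up to three:

```c
/* Counts how many times interrupts get masked while reporting one
 * quiescent state; the real code uses local_irq_save()/restore(). */
static int toy_irq_saves;

static void toy_local_irq_save(void)    { toy_irq_saves++; }
static void toy_local_irq_restore(void) { }

/* After the patch, the helper assumes the caller already masked irqs. */
static void toy_rcu_qsctr_help(void)
{
    /* record the quiescent state, possibly wake the callback kthread */
}

static void toy_rcu_sched_qs(void)
{
    toy_local_irq_save();       /* one save covers all the helpers */
    toy_rcu_qsctr_help();       /* rcu_sched bookkeeping */
    toy_rcu_qsctr_help();       /* rcu_bh bookkeeping */
    toy_local_irq_restore();
}
```

Fewer save/restore pairs also explains the smaller object code shown in the size table.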
     
  • Many rcu callbacks functions just call kfree() on the base structure.
    These functions are trivial, but their size adds up, and furthermore
    when they are used in a kernel module, that module must invoke the
    high-latency rcu_barrier() function at module-unload time.

    The kfree_rcu() function introduced by this commit addresses this issue.
    Rather than encoding a function address in the embedded rcu_head
    structure, kfree_rcu() instead encodes the offset of the rcu_head
    structure within the base structure. Because the functions are not
    allowed in the low-order 4096 bytes of kernel virtual memory, offsets
    up to 4095 bytes can be accommodated. If the offset is larger than
    4095 bytes, a compile-time error will be generated in __kfree_rcu().
    If this error is triggered, you can either fall back to use of call_rcu()
    or rearrange the structure to position the rcu_head structure into the
    first 4096 bytes.

    Note that the allowable offset might decrease in the future, for example,
    to allow something like kmem_cache_free_rcu().

    The new kfree_rcu() function can replace code as follows:

        call_rcu(&p->rcu, simple_kfree_callback);

    where "simple_kfree_callback()" might be defined as follows:

        void simple_kfree_callback(struct rcu_head *p)
        {
            struct foo *q = container_of(p, struct foo, rcu);

            kfree(q);
        }

    with the following:

        kfree_rcu(p, rcu);

    Note that "rcu" is the name of the rcu_head field in the structure
    being freed. The reason for passing the base pointer and field name,
    rather than a pointer to the rcu_head structure itself, is that this
    approach allows better type checking.

    This commit is based on earlier work by Lai Jiangshan and Manfred Spraul:

    Lai's V1 patch: http://lkml.org/lkml/2008/9/18/1
    Manfred's patch: http://lkml.org/lkml/2009/1/2/115

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Manfred Spraul
    Signed-off-by: Paul E. McKenney
    Reviewed-by: David Howells
    Reviewed-by: Josh Triplett

    Lai Jiangshan
     

14 Jan, 2011

1 commit

  • If the RCU callback-processing kthread has nothing to do, it parks in
    a wait_event(). If RCU remains idle for more than two minutes, the
    kernel complains about this. This commit changes from wait_event()
    to wait_event_interruptible() to prevent the kernel from complaining
    just because RCU is idle.

    Reported-by: Russell King
    Signed-off-by: Paul E. McKenney
    Tested-by: Thomas Weber
    Tested-by: Russell King

    Paul E. McKenney
     

30 Nov, 2010

2 commits

  • Add tracing for the tiny RCU implementations, including statistics on
    boosting in the case of TINY_PREEMPT_RCU and RCU_BOOST.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Add priority boosting, but only for TINY_PREEMPT_RCU. This is enabled
    by the default-off RCU_BOOST kernel parameter. The priority to which to
    boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel
    parameter (defaulting to real-time priority 1) and the time to wait
    before boosting the readers blocking a given grace period is controlled
    by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds).

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

18 Nov, 2010

1 commit

  • If RCU priority boosting is to be meaningful, callback invocation must
    be boosted in addition to preempted RCU readers. Otherwise, in presence
    of CPU real-time threads, the grace period ends, but the callbacks don't
    get invoked. If the callbacks don't get invoked, the associated memory
    doesn't get freed, so the system is still subject to OOM.

    But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
    moves the callback invocations to a kthread, which can be boosted easily.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

20 Aug, 2010

1 commit

  • Implement a small-memory-footprint uniprocessor-only implementation of
    preemptible RCU. This implementation uses but a single blocked-tasks
    list rather than the combinatorial number used per leaf rcu_node by
    TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
    processing. This version also takes advantage of uniprocessor execution
    to accelerate grace periods in the case where there are no readers.

    The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.

    This implementation is a step towards having the RCU implementation
    selected by the SMP and PREEMPT kernel configuration variables, which
    can happen once this implementation has accumulated sufficient experience.

    Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
    suggested by Steve Rostedt in order to avoid the compiler-reordering
    issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).

    As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
    savings compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time
    workloads, CONFIG_TINY_RCU is even better.

    CONFIG_TREE_PREEMPT_RCU

    text  data  bss   dec  filename
      13     0    0    13  kernel/rcupdate.o
    6170   825   28  7023  kernel/rcutree.o
                     ----
                     7026  Total

    CONFIG_TINY_PREEMPT_RCU

    text  data  bss   dec  filename
      13     0    0    13  kernel/rcupdate.o
    2081    81    8  2170  kernel/rcutiny.o
                     ----
                     2183  Total

    CONFIG_TINY_RCU (non-preemptible)

    text  data  bss   dec  filename
      13     0    0    13  kernel/rcupdate.o
     719    25    0   744  kernel/rcutiny.o
                      ---
                      757  Total

    Requested-by: Loïc Minier
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

15 Jun, 2010

1 commit

  • Helps find racy users of call_rcu(), which result in hangs because list
    entries are overwritten and/or skipped.

    Changelog since v4:
    - Bisectability is now OK
    - Now generate a WARN_ON_ONCE() for non-initialized rcu_head passed to
    call_rcu(). Statically initialized objects are detected with
    object_is_static().
    - Rename rcu_head_init_on_stack to init_rcu_head_on_stack.
    - Remove init_rcu_head() completely.

    Changelog since v3:
    - Include comments from Lai Jiangshan

    This new patch version is based on the debugobjects with the newly introduced
    "active state" tracker.

    Non-initialized entries are all considered as "statically initialized". An
    activation fixup (triggered by call_rcu()) takes care of performing the debug
    object initialization without issuing any warning. Since we cannot increase the
    size of struct rcu_head, I don't see much room to put an identifier for
    statically initialized rcu_head structures. So for now, we have to live without
    "activation without explicit init" detection. But the main purpose of this debug
    option is to detect double-activations (double call_rcu() use of a rcu_head
    before the callback is executed), which is correctly addressed here.

    This also detects potential internal RCU callback corruption, which would cause
    the callbacks to be executed twice.

    Signed-off-by: Mathieu Desnoyers
    CC: David S. Miller
    CC: "Paul E. McKenney"
    CC: akpm@linux-foundation.org
    CC: mingo@elte.hu
    CC: laijs@cn.fujitsu.com
    CC: dipankar@in.ibm.com
    CC: josh@joshtriplett.org
    CC: dvhltc@us.ibm.com
    CC: niv@us.ibm.com
    CC: tglx@linutronix.de
    CC: peterz@infradead.org
    CC: rostedt@goodmis.org
    CC: Valdis.Kletnieks@vt.edu
    CC: dhowells@redhat.com
    CC: eric.dumazet@gmail.com
    CC: Alexey Dobriyan
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Lai Jiangshan

    Mathieu Desnoyers
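The double-activation case this patch catches can be modeled in userspace with a simple flag playing the role of the debugobjects active-state tracker; all names here are illustrative:

```c
struct toy_rcu_head {
    int queued;     /* stands in for the debugobjects "active" state */
};

static int toy_warnings;

/* Queueing an already-queued head would corrupt the callback list, so
 * the debug check fires instead of enqueueing a second time. */
static void toy_call_rcu(struct toy_rcu_head *h)
{
    if (h->queued) {
        toy_warnings++;   /* the kernel would WARN here */
        return;
    }
    h->queued = 1;
}

/* Once the callback has run, the head may legitimately be reused. */
static void toy_callback_done(struct toy_rcu_head *h)
{
    h->queued = 0;
}
```

The real patch gets the same effect without enlarging struct rcu_head by keeping the state in the external debugobjects tracking structures.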
     

12 May, 2010

1 commit


11 May, 2010

3 commits


23 Nov, 2009

1 commit

  • The function rcu_init() is a wrapper for __rcu_init(), and it also
    sets up the CPU-hotplug notifier for rcu_barrier_cpu_hotplug().
    But TINY_RCU doesn't need CPU-hotplug notification, and
    rcu_barrier_cpu_hotplug() is a simple wrapper for
    rcu_cpu_notify().

    So push rcu_init() out to kernel/rcutree.c and kernel/rcutiny.c
    and get rid of the wrapper function rcu_barrier_cpu_hotplug().

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

26 Oct, 2009

2 commits

  • No change in functionality - just straighten out a few small
    stylistic details.

    Cc: Paul E. McKenney
    Cc: David Howells
    Cc: Josh Triplett
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • This patch provides a small-footprint RCU implementation designed
    for !SMP builds. In particular, the implementation of
    synchronize_rcu() is extremely lightweight and offers high
    performance. It passes rcutorture testing in each of the
    four relevant configurations (combinations of NO_HZ and PREEMPT)
    on x86. This saves about 1K bytes compared to old Classic RCU
    (which is no longer in mainline), and more than three kilobytes
    compared to Hierarchical RCU (updated to 2.6.30):

    CONFIG_TREE_RCU:

    text  data  bss   dec  filename
     183     4    0   187  kernel/rcupdate.o
    2783   520   36  3339  kernel/rcutree.o
                     3526  Total (vs 4565 for v7)

    CONFIG_TREE_PREEMPT_RCU:

    text  data  bss   dec  filename
     263     4    0   267  kernel/rcupdate.o
    4594   776   52  5422  kernel/rcutree.o
                     5689  Total (6155 for v7)

    CONFIG_TINY_RCU:

    text  data  bss   dec  filename
      96     4    0   100  kernel/rcupdate.o
     734    24    0   758  kernel/rcutiny.o
                      858  Total (vs 848 for v7)

    The above is for x86. Your mileage may vary on other platforms.
    Further compression is possible, but is being procrastinated.

    Changes from v7 (http://lkml.org/lkml/2009/10/9/388)

    o Apply Lai Jiangshan's review comments (aside from
    might_sleep() in synchronize_sched(), which is covered by SMP builds).

    o Fix up expedited primitives.

    Changes from v6 (http://lkml.org/lkml/2009/9/23/293).

    o Forward ported to put it into the 2.6.33 stream.

    o Added lockdep support.

    o Make lightweight rcu_barrier.

    Changes from v5 (http://lkml.org/lkml/2009/6/23/12).

    o Ported to latest pre-2.6.32 merge window kernel.

    - Renamed rcu_qsctr_inc() to rcu_sched_qs().
    - Renamed rcu_bh_qsctr_inc() to rcu_bh_qs().
    - Provided trivial rcu_cpu_notify().
    - Provided trivial exit_rcu().
    - Provided trivial rcu_needs_cpu().
    - Fixed up the rcu_*_enter/exit() functions in linux/hardirq.h.

    o Removed the dependence on EMBEDDED, with a view to making
    TINY_RCU default for !SMP at some time in the future.

    o Added (trivial) support for expedited grace periods.

    Changes from v4 (http://lkml.org/lkml/2009/5/2/91) include:

    o Squeeze the size down a bit further by removing the
    ->completed field from struct rcu_ctrlblk.

    o This permits synchronize_rcu() to become the empty function.
    Previous concerns about rcutorture were unfounded, as
    rcutorture correctly handles a constant value from
    rcu_batches_completed() and rcu_batches_completed_bh().

    Changes from v3 (http://lkml.org/lkml/2009/3/29/221) include:

    o Changed rcu_batches_completed(), rcu_batches_completed_bh(),
    rcu_enter_nohz(), rcu_exit_nohz(), rcu_nmi_enter(), and
    rcu_nmi_exit(), to be static inlines, as suggested by David
    Howells. Doing this saves about 100 bytes from rcutiny.o.
    (The numbers between v3 and this v4 of the patch are not directly
    comparable, since they are against different versions of Linux.)

    Changes from v2 (http://lkml.org/lkml/2009/2/3/333) include:

    o Fix whitespace issues.

    o Change short-circuit "||" operator to instead be "+" in order
    to fix performance bug noted by "kraai" on LWN.

    (http://lwn.net/Articles/324348/)

    Changes from v1 (http://lkml.org/lkml/2009/1/13/440) include:

    o This version depends on EMBEDDED as well as !SMP, as suggested
    by Ingo.

    o Updated rcu_needs_cpu() to unconditionally return zero,
    permitting the CPU to enter dynticks-idle mode at any time.
    This works because callbacks can be invoked upon entry to
    dynticks-idle mode.

    o Paul is now OK with this being included, based on a poll at
    the Kernel Miniconf at linux.conf.au, where about ten people said
    that they cared about saving 900 bytes on single-CPU systems.

    o Applies to both mainline and tip/core/rcu.

    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Acked-by: Josh Triplett
    Reviewed-by: Lai Jiangshan
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: avi@redhat.com
    Cc: mtosatti@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
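The "||" to "+" change in the v2 list above can be shown in isolation. For flags known to be 0 or 1 the two forms agree, but the addition avoids the conditional branch implied by short-circuit evaluation (the function names here are made up for the illustration):

```c
/* Short-circuit form: the compiler must preserve "don't evaluate b
 * when a is true", which typically costs a branch. */
static int toy_pending_or(int a, int b)  { return a || b; }

/* Branch-free form used by the patch: valid because both operands are
 * known to be 0 or 1, so the sum is nonzero exactly when either is. */
static int toy_pending_add(int a, int b) { return (a + b) != 0; }
```

On a hot path such as the per-tick pending check, removing a hard-to-predict branch is a measurable win even though the two forms are logically equivalent.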