30 Jul, 2013

1 commit

  • All the RCU tracepoints and functions that reference char pointers do
    so with just 'char *' even though they do not modify the contents of
    the string itself. This will cause warnings if a const char * is used
    in one of these functions.

    The RCU tracepoints store the pointer to the string so that it can be
    referred back to when the trace output is displayed. As this can be
    minutes, hours or even days later, those strings had better be constant.

    This change also opens the door to allow the RCU tracepoint strings and
    their addresses to be exported so that userspace tracing tools can
    translate the contents of the pointers of the RCU tracepoints.

    Signed-off-by: Steven Rostedt


26 Mar, 2013

2 commits


29 Jan, 2013

1 commit


09 Jan, 2013

3 commits

  • This commit adds event tracing for callback acceleration to allow better
    tracking of callbacks through the system.

    Signed-off-by: Paul E. McKenney

  • When the type of the global variable blimit changed from int to long,
    the type of the blimit argument of trace_rcu_batch_start() needed to
    change as well. This commit fixes this issue.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

  • Currently, rcutorture traces every read-side access. This can be
    problematic because even a two-minute rcutorture run on a two-CPU system
    can generate 28,853,363 reads. Normally, only a failing read is of
    interest, so this commit adjusts rcutorture's tracing to trace only
    failing reads. The resulting event tracing records the time
    and the ->completed value captured at the beginning of the RCU read-side
    critical section, allowing correlation with other event-tracing messages.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett
    [ paulmck: Add fix to build problem located by Randy Dunlap based on
    diagnosis by Steven Rostedt. ]


17 Nov, 2012

1 commit

  • RCU callback execution can add significant OS jitter and also can
    degrade both scheduling latency and, in asymmetric multiprocessors,
    energy efficiency. This commit therefore adds the ability for selected
    CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
    to kthreads. If the "rcu_nocb_poll" boot parameter is also specified,
    these kthreads will do polling, removing the need for the offloaded
    CPUs to do wakeups. At least one CPU must be doing normal callback
    processing: currently CPU 0 cannot be selected as a no-CBs CPU.
    In addition, attempts to offline the last normal-CBs CPU will fail.

    This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
    this commit includes fixes to problems located by Fengguang Wu's
    kbuild test robot.

    [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]

    Signed-off-by: Paul E. McKenney


03 Jul, 2012

1 commit


07 Jun, 2012

1 commit

  • In the current code, a short dyntick-idle interval (where there is
    at least one non-lazy callback on the CPU) and a long dyntick-idle
    interval (where there are only lazy callbacks on the CPU) are traced
    identically, which can be less than helpful. This commit therefore
    emits different event traces in these two cases.

    Signed-off-by: Paul E. McKenney
    Tested-by: Heiko Carstens
    Tested-by: Pascal Chapperon


10 May, 2012

1 commit

  • The current RCU_FAST_NO_HZ assumes that timers do not migrate unless a
    CPU goes offline, in which case it assumes that the CPU will have to come
    out of dyntick-idle mode (cancelling the timer) in order to go offline.
    This is important because when RCU_FAST_NO_HZ permits a CPU to enter
    dyntick-idle mode despite having RCU callbacks pending, it posts a timer
    on that CPU to force a wakeup on that CPU. This wakeup ensures that the
    CPU will eventually handle the end of the grace period, including invoking
    its RCU callbacks.

    However, Pascal Chapperon's test setup shows that the timer handler
    rcu_idle_gp_timer_func() really does get invoked in some cases. This is
    problematic because this can cause the CPU that entered dyntick-idle
    mode despite still having RCU callbacks pending to remain in
    dyntick-idle mode indefinitely, which means that its RCU callbacks might
    never be invoked. This situation can result in grace-period delays or
    even system hangs, which matches Pascal's observations of slow boot-up
    and shutdown (https://lkml.org/lkml/2012/4/5/142). See also the bugzilla:

    https://bugzilla.redhat.com/show_bug.cgi?id=806548

    This commit therefore causes the "should never be invoked" timer handler
    rcu_idle_gp_timer_func() to use smp_call_function_single() to wake up
    the CPU for which the timer was intended, allowing that CPU to invoke
    its RCU callbacks in a timely manner.

    Reported-by: Pascal Chapperon
    Signed-off-by: Paul E. McKenney


25 Apr, 2012

1 commit


22 Feb, 2012

1 commit

  • When CONFIG_RCU_FAST_NO_HZ is enabled, RCU will allow a given CPU to
    enter dyntick-idle mode even if it still has RCU callbacks queued.
    RCU avoids system hangs in this case by scheduling a timer for several
    jiffies in the future. However, if all of the callbacks on that CPU
    are from kfree_rcu(), there is no reason to wake the CPU up, as it is
    not a problem to defer freeing of memory.

    This commit therefore tracks the number of callbacks on a given CPU
    that are from kfree_rcu(), and avoids scheduling the timer if all of
    a given CPU's callbacks are from kfree_rcu().

    Signed-off-by: Paul E. McKenney


12 Dec, 2011

8 commits

  • The current rcu_batch_end event trace records only the name of the RCU
    flavor and the total number of callbacks that remain queued on the
    current CPU. This is insufficient for testing and tuning the new
    dyntick-idle RCU_FAST_NO_HZ code, so this commit adds idle state along
    with whether or not any of the callbacks that were ready to invoke
    at the beginning of rcu_do_batch() are still queued.

    Signed-off-by: Paul E. McKenney

  • The current implementation of RCU_FAST_NO_HZ prevents CPUs from entering
    dyntick-idle state if they have RCU callbacks pending. Unfortunately,
    this has the side-effect of often preventing them from entering this
    state, especially if at least one other CPU is not in dyntick-idle state.
    However, the resulting per-tick wakeup is wasteful in many cases: if the
    CPU has already fully responded to the current RCU grace period, there
    will be nothing for it to do until this grace period ends, which will
    frequently take several jiffies.

    This commit therefore permits a CPU that has done everything that the
    current grace period has asked of it (rcu_pending() == 0) to enter
    dyntick-idle mode even if it still has RCU callbacks pending.
    However, such a CPU posts a timer to
    wake it up several jiffies later (6 jiffies, based on experience with
    grace-period lengths). This wakeup is required to handle situations
    that can result in all CPUs being in dyntick-idle mode, thus failing
    to ever complete the current grace period. If a CPU wakes up before
    the timer goes off, then it cancels that timer, thus avoiding spurious
    wakeups.

    Signed-off-by: Paul E. McKenney

  • With the new implementation of RCU_FAST_NO_HZ, it was possible to hang
    RCU grace periods as follows:

    o CPU 0 attempts to go idle, cycles several times through the
    rcu_prepare_for_idle() loop, then goes dyntick-idle when
    RCU needs nothing more from it, while still having at least
    one RCU callback pending.

    o CPU 1 goes idle with no callbacks.

    Both CPUs can then stay in dyntick-idle mode indefinitely, preventing
    the RCU grace period from ever completing, possibly hanging the system.

    This commit therefore prevents CPUs that have RCU callbacks from entering
    dyntick-idle mode. This approach also eliminates the need for the
    end-of-grace-period IPIs used previously.

    Signed-off-by: Paul E. McKenney

  • This commit adds trace_rcu_prep_idle(), which is invoked from
    rcu_prepare_for_idle() and rcu_wake_cpu() to trace attempts on
    the part of RCU to force CPUs into dyntick-idle mode.

    Signed-off-by: Paul E. McKenney

  • This commit updates the trace_rcu_dyntick() header comment to reflect
    events added by commit 4b4f421.

    Signed-off-by: Paul E. McKenney

  • The trace_rcu_dyntick() trace event did not print both the old and
    the new value of the nesting level, and furthermore printed only
    the low-order 32 bits of it. This could result in some confusion
    when interpreting trace-event dumps, so this commit prints both
    the old and the new value, prints the full 64 bits, and also selects
    the process-entry/exit increment to print nicely in hexadecimal.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

  • Trace the rcutorture RCU accesses and dump the trace buffer when the
    first failure is detected.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

  • Earlier versions of RCU used the scheduling-clock tick to detect idleness
    by checking for the idle task, but handled idleness differently for
    CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side
    critical sections in the idle task, for example, for tracing. A more
    fine-grained detection of idleness is therefore required.

    This commit presses the old dyntick-idle code into full-time service,
    so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
    always invoked at the beginning of an idle loop iteration. Similarly,
    rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
    at the end of an idle-loop iteration. This allows the idle task to
    use RCU everywhere except between consecutive rcu_idle_enter() and
    rcu_idle_exit() calls, in turn allowing architecture maintainers to
    specify exactly where in the idle loop that RCU may be used.

    Because some of the userspace upcall uses can result in what looks
    to RCU like half of an interrupt, it is not possible to expect that
    the irq_enter() and irq_exit() hooks will give exact counts. This
    patch therefore expands the ->dynticks_nesting counter to 64 bits
    and uses two separate bitfields to count process/idle transitions
    and interrupt entry/exit transitions. It is presumed that userspace
    upcalls do not happen in the idle loop or from usermode execution
    (though usermode might do a system call that results in an upcall).
    The counter is hard-reset on each process/idle transition, which
    avoids the interrupt entry/exit error from accumulating. Overflow
    is avoided by the 64-bitness of the ->dynticks_nesting counter.

    This commit also adds warnings if a non-idle task asks RCU to enter
    idle state (and these checks will need some adjustment before applying
    Frederic's OS-jitter patches, http://lkml.org/lkml/2011/10/7/246).
    In addition, validation of ->dynticks and ->dynticks_nesting is added.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett


29 Sep, 2011

5 commits

  • Add trace events to record grace-period start and end, quiescent states,
    CPUs noticing grace-period start and end, grace-period initialization,
    call_rcu() invocation, tasks blocking in RCU read-side critical sections,
    tasks exiting those same critical sections, force_quiescent_state()
    detection of dyntick-idle and offline CPUs, CPUs entering and leaving
    dyntick-idle mode (except from NMIs), CPUs coming online and going
    offline, and CPUs being kicked for staying in dyntick-idle mode for too
    long (as in many weeks, even on 32-bit systems).

    Signed-off-by: Paul E. McKenney

  • rcu: Add the rcu flavor to callback trace events

    The earlier trace events for registering RCU callbacks and for invoking
    them did not include the RCU flavor (rcu_bh, rcu_preempt, or rcu_sched).
    This commit adds the RCU flavor to those trace events.

    Signed-off-by: Paul E. McKenney

  • Add event-trace markers to TREE_RCU kthreads to allow including these
    kthread's CPU time in the utilization calculations.

    Signed-off-by: Paul E. McKenney

  • Add a string to the rcu_batch_start() and rcu_batch_end() trace
    messages that indicates the RCU type ("rcu_sched", "rcu_bh", or
    "rcu_preempt"). The trace messages for the actual invocations
    themselves are not marked, as it should be clear from the
    rcu_batch_start() and rcu_batch_end() events before and after.

    Signed-off-by: Paul E. McKenney

  • This commit adds the trace_rcu_utilization() marker that is to be
    used to allow postprocessing scripts to compute RCU's CPU utilization,
    give or take event-trace overhead. Note that we do not include RCU's
    dyntick-idle interface because event tracing requires RCU protection,
    which is not available in dyntick-idle mode.

    Signed-off-by: Paul E. McKenney

  • There was recently some controversy about the overhead of invoking RCU
    callbacks. Add TRACE_EVENT()s to obtain fine-grained timings for the
    start and stop of a batch of callbacks and also for each callback invoked.

    Signed-off-by: Paul E. McKenney
