29 Jan, 2013

1 commit

  • Tiny RCU has historically omitted RCU CPU stall warnings in order to
    reduce memory requirements; however, the lack of these warnings recently
    caused Thomas Gleixner some debugging pain. Therefore, this commit adds
    RCU CPU stall warnings to Tiny RCU if RCU_TRACE=y. This keeps the memory
    footprint small, while still enabling CPU stall warnings in kernels built
    to enable them.

    Updated to include Josh Triplett's suggested use of the RCU_STALL_COMMON
    config variable to simplify the #if expressions.

    Reported-by: Thomas Gleixner
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
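
    As a rough illustration of how a single shared config symbol simplifies the
    preprocessor guards, a sketch follows (the exact Kconfig spelling and the
    function name here are assumptions based on the description above, not the
    literal patch):

        /*
         * CONFIG_RCU_STALL_COMMON is selected by the tree flavors, and by
         * TINY_RCU when RCU_TRACE=y, so each use site needs only one test
         * rather than a compound #if expression naming every flavor.
         */
        #ifdef CONFIG_RCU_STALL_COMMON
        void rcu_check_cpu_stall(void);                   /* real stall check */
        #else /* #ifdef CONFIG_RCU_STALL_COMMON */
        static inline void rcu_check_cpu_stall(void) { }  /* compiles away */
        #endif /* #else #ifdef CONFIG_RCU_STALL_COMMON */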
     

24 Oct, 2012

1 commit

  • There have been some embedded applications that would benefit from
    use of expedited grace-period primitives. In some ways, this is
    similar to synchronize_net() doing either a normal or an expedited
    grace period depending on lock state, but with control outside of
    the kernel.

    This commit therefore adds rcu_expedited boot and sysfs parameters
    that cause the kernel to substitute expedited primitives for the
    normal grace-period primitives.

    [ paulmck: Add trace/event/rcu.h to kernel/srcu.c to avoid build error.
    Get rid of infinite loop through contention path.]

    Signed-off-by: Antti P Miettinen
    Signed-off-by: Paul E. McKenney

    Antti P Miettinen
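
    A minimal sketch of the substitution (rcu_expedited is the knob described
    above; the boot/sysfs plumbing and the normal-path helper shown here are
    simplified assumptions):

        int rcu_expedited;      /* set by the rcu_expedited= boot parameter or sysfs */

        void synchronize_rcu(void)
        {
                if (rcu_expedited)
                        synchronize_rcu_expedited();    /* expedited grace period */
                else
                        wait_rcu_gp(call_rcu);          /* normal grace period */
        }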
     

23 Sep, 2012

1 commit


06 Jul, 2012

2 commits

  • The Linux kernel coding style says that single-statement blocks should
    omit curly braces unless the other leg of the "if" statement has
    multiple statements, in which case the curly braces should be included.
    This commit fixes RCU's violations of this rule.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
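
    The rule in miniature (condition and function names are illustrative only):

        /* Both legs are single statements, so braces are omitted: */
        if (rcu_gp_in_progress())
                rcu_report_qs();
        else
                invoke_rcu_callbacks();

        /* One leg needs multiple statements, so both legs get braces: */
        if (rcu_gp_in_progress()) {
                rcu_report_qs();
        } else {
                rcu_start_gp();
                invoke_rcu_callbacks();
        }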
     
  • …a' and 'fnh.2012.07.02a' into HEAD

    bigrtm: First steps towards getting RCU out of the way of
    tens-of-microseconds real-time response on systems compiled
    with NR_CPUS=4096. Also cleanups for and increased concurrency
    of rcu_barrier() family of primitives.
    doctorture: rcutorture and documentation improvements.
    fixes: Miscellaneous fixes.
    fnh: RCU_FAST_NO_HZ fixes and improvements.

    Paul E. McKenney
     

03 Jul, 2012

2 commits

  • The TINY_PREEMPT_RCU version of rcu_preempt_needs_cpu(), which is called
    from rcu_needs_cpu(), assumes that it is invoked from a quiescent state
    with respect to the CPU. This is no longer the case. This commit therefore
    updates rcu_preempt_needs_cpu() to make it aware that it might not be
    running in a quiescent state.

    Signed-off-by: Paul E. McKenney
    Tested-by: Heiko Carstens
    Tested-by: Pascal Chapperon

    Paul E. McKenney
     
  • The CONFIG_TREE_PREEMPT_RCU and CONFIG_TINY_PREEMPT_RCU versions of
    __rcu_read_lock() and __rcu_read_unlock() are identical, so this commit
    consolidates them into kernel/rcupdate.h.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
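
    For reference, the consolidated __rcu_read_lock() really is this small
    (lightly simplified from the preemptible-RCU code of that era):

        void __rcu_read_lock(void)
        {
                current->rcu_read_lock_nesting++;
                barrier();      /* the critical section cannot leak above this point */
        }
        EXPORT_SYMBOL_GPL(__rcu_read_lock);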
     

03 May, 2012

1 commit

  • When running preemptible RCU, if a task exits in an RCU read-side
    critical section having blocked within that same RCU read-side critical
    section, the task must be removed from the list of tasks blocking a
    grace period (perhaps the current grace period, perhaps the next grace
    period, depending on timing). The exit() path invokes exit_rcu() to
    do this cleanup.

    However, the current implementation of exit_rcu() needlessly does the
    cleanup even if the task did not block within the current RCU read-side
    critical section, which wastes time and needlessly increases the size
    of the state space. Fix this by only doing the cleanup if the current
    task is actually on the list of tasks blocking some grace period.

    While we are at it, consolidate the two identical exit_rcu() functions
    into a single function.

    Signed-off-by: Paul E. McKenney
    Tested-by: Linus Torvalds

    Conflicts:

    kernel/rcupdate.c

    Paul E. McKenney
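
    A sketch of the consolidated exit_rcu() with the early exit described above
    (close to the function of that era, lightly simplified):

        void exit_rcu(void)
        {
                struct task_struct *t = current;

                /* Fastpath: this task never blocked in a read-side critical section. */
                if (likely(list_empty(&t->rcu_node_entry)))
                        return;

                /* Otherwise, pretend to be in one outermost critical section... */
                t->rcu_read_lock_nesting = 1;
                barrier();
                t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
                /* ...so that the unlock slowpath dequeues the task and cleans up. */
                __rcu_read_unlock();
        }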
     

22 Feb, 2012

6 commits

  • This is a port of commit #82e78d80 from TREE_PREEMPT_RCU to
    TINY_PREEMPT_RCU.

    This commit uses the fact that current->rcu_boost_mutex is set
    any time that the RCU_READ_UNLOCK_BOOSTED flag is set in the
    current->rcu_read_unlock_special bitmask. This allows tests of
    the bit to be changed to tests of the pointer, which in turn allows
    the RCU_READ_UNLOCK_BOOSTED flag to be eliminated.

    Please note that the check of current->rcu_read_unlock_special need not
    change because any time that RCU_READ_UNLOCK_BOOSTED was set, so was
    RCU_READ_UNLOCK_BLOCKED. Therefore, __rcu_read_unlock() can continue
    testing current->rcu_read_unlock_special for non-zero, as before.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
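
    The test change in the unlock slowpath is roughly as follows (field names as
    in the preemptible tiny RCU of that era; surrounding code omitted):

        /* Before: a dedicated flag recorded that this reader had been boosted. */
        if (t->rcu_read_unlock_special & RCU_READ_UNLOCK_BOOSTED) {
                t->rcu_read_unlock_special &= ~RCU_READ_UNLOCK_BOOSTED;
                rt_mutex_unlock(t->rcu_boost_mutex);
                t->rcu_boost_mutex = NULL;
        }

        /* After: a non-NULL ->rcu_boost_mutex pointer carries the same fact. */
        if (t->rcu_boost_mutex != NULL) {
                struct rt_mutex *rbmp = t->rcu_boost_mutex;

                t->rcu_boost_mutex = NULL;
                rt_mutex_unlock(rbmp);
        }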
     
  • This is a port to TINY_RCU of Peter Zijlstra's commit #ec433f0c5

    The rcu_read_unlock_special() function relies on in_irq() to exclude
    scheduler activity from interrupt level. This fails because irq_exit()
    can invoke the scheduler after clearing the preempt_count() bits that
    in_irq() uses to determine that it is at interrupt level. This situation
    can result in failures as follows:

    $task                               IRQ                     SoftIRQ

    rcu_read_lock()

    /* do stuff */

    <preempt> |= UNLOCK_BLOCKED

    rcu_read_unlock()
      --t->rcu_read_lock_nesting

                                        irq_enter();
                                        /* do stuff, don't use RCU */
                                        irq_exit();
                                          sub_preempt_count(IRQ_EXIT_OFFSET);
                                          invoke_softirq()

                                                                ttwu();
                                                                  spin_lock_irq(&pi->lock)
                                                                  rcu_read_lock();
                                                                  /* do stuff */
                                                                  rcu_read_unlock();
                                                                    rcu_read_unlock_special()
                                                                      rcu_report_exp_rnp()
                                                                        ttwu()
                                                                          spin_lock_irq(&pi->lock) /* deadlock */

    rcu_read_unlock_special(t);

    This can be triggered 'easily' because invoke_softirq() immediately does
    a ttwu() of ksoftirqd/# instead of doing the in-place softirq stuff first,
    but even without that the above happens.

    Cure this by also excluding softirqs from the rcu_read_unlock_special()
    handler and ensuring the force_irqthreads ksoftirqd/# wakeup is done
    from full softirq context.

    It is also necessary to delay the ->rcu_read_lock_nesting decrement until
    after rcu_read_unlock_special(). This delay is handled by the commit
    "Protect __rcu_read_unlock() against scheduler-using irq handlers".

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
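
    The core of the cure is a widened context check in rcu_read_unlock_special();
    roughly (simplified):

        /*
         * Before: only hard-irq context was excluded, so a softirq reached via
         * irq_exit() could still call back into the scheduler from here.
         */
        if (in_irq()) {
                local_irq_restore(flags);
                return;
        }

        /* After: exclude softirq handlers as well. */
        if (in_irq() || in_serving_softirq()) {
                local_irq_restore(flags);
                return;
        }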
     
  • This is a port of commit #b0d3041 from TREE_RCU to TREE_PREEMPT_RCU.

    Under some rare but real combinations of configuration parameters, RCU
    callbacks are posted during early boot that use kernel facilities that are
    not yet initialized. Therefore, when these callbacks are invoked, hard
    hangs and crashes ensue. This commit therefore prevents RCU callbacks
    from being invoked until after the scheduler is fully up and running,
    as in after multiple tasks have been spawned.

    It might well turn out that a better approach is to identify the specific
    RCU callbacks that are causing this problem, but that discussion will
    wait until such time as someone really needs an RCU callback to be invoked
    (as opposed to merely registered) during early boot.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
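
    A sketch of the gating (rcu_scheduler_fully_active is the flag the TREE_RCU
    original used for this purpose; the enclosing function here is a placeholder):

        extern int rcu_scheduler_fully_active;  /* nonzero once kthreads can run */

        static void rcu_invoke_ready_callbacks(struct rcu_ctrlblk *rcp)
        {
                /* Too early in boot: leave the callbacks queued for later. */
                if (!rcu_scheduler_fully_active)
                        return;
                /* ...otherwise walk the done list and invoke each callback... */
        }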
     
  • This is a port of commit #be0e1e21 to TINY_PREEMPT_RCU. This uses
    noinline to prevent rcu_read_unlock_special() from being inlined into
    __rcu_read_unlock().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit ports commit #10f39bb1b2 (rcu: protect __rcu_read_unlock()
    against scheduler-using irq handlers) from TREE_PREEMPT_RCU to
    TINY_PREEMPT_RCU. The following is a corresponding port of that
    commit message.

    The addition of RCU read-side critical sections within runqueue and
    priority-inheritance critical sections introduced some deadlocks,
    for example, involving interrupts from __rcu_read_unlock() where the
    interrupt handlers call wake_up(). This situation can cause the
    instance of __rcu_read_unlock() invoked from interrupt to do some
    of the processing that would otherwise have been carried out by the
    task-level instance of __rcu_read_unlock(). When the interrupt-level
    instance of __rcu_read_unlock() is called with a scheduler lock held from
    interrupt-entry/exit situations where in_irq() returns false, deadlock can
    result. Of course, in a UP kernel, there are not really any deadlocks,
    but the upper-level critical section can still be fatally confused
    by the lower-level critical section changing things out from under it.

    This commit resolves these deadlocks by using negative values of the
    per-task ->rcu_read_lock_nesting counter to indicate that an instance of
    __rcu_read_unlock() is in flight, which in turn prevents instances from
    interrupt handlers from doing any special processing. Note that nested
    rcu_read_lock()/rcu_read_unlock() pairs are still permitted, but they will
    never see ->rcu_read_lock_nesting go to zero, and will therefore never
    invoke rcu_read_unlock_special(), thus preventing them from seeing the
    RCU_READ_UNLOCK_BLOCKED bit should it be set in ->rcu_read_unlock_special.
    This commit also adds a check for ->rcu_read_lock_nesting being negative
    in rcu_check_callbacks(), thus preventing the RCU_READ_UNLOCK_NEED_QS
    bit from being set should a scheduling-clock interrupt occur while
    __rcu_read_unlock() is exiting from an outermost RCU read-side critical
    section.

    Of course, __rcu_read_unlock() can be preempted during the time that
    ->rcu_read_lock_nesting is negative. This could result in the setting
    of the RCU_READ_UNLOCK_BLOCKED bit after __rcu_read_unlock() checks it,
    and would also result in this task being queued on the corresponding
    rcu_node structure's blkd_tasks list. Therefore, some later RCU read-side
    critical section would enter rcu_read_unlock_special() to clean up --
    which could result in deadlock (OK, OK, fatal confusion) if that RCU
    read-side critical section happened to be in the scheduler where the
    runqueue or priority-inheritance locks were held.

    To prevent the possibility of fatal confusion that might result from
    preemption during the time that ->rcu_read_lock_nesting is negative,
    this commit also makes rcu_preempt_note_context_switch() check for
    negative ->rcu_read_lock_nesting, thus refraining from queuing the task
    (and from setting RCU_READ_UNLOCK_BLOCKED) if we are already exiting
    from the outermost RCU read-side critical section (in other words,
    we really are no longer actually in that RCU read-side critical
    section). In addition, rcu_preempt_note_context_switch() invokes
    rcu_read_unlock_special() to carry out the cleanup in this case, which
    clears out the ->rcu_read_unlock_special bits and dequeues the task
    (if necessary), in turn avoiding needless delay of the current RCU grace
    period and needless RCU priority boosting.

    It is still illegal to call rcu_read_unlock() while holding a scheduler
    lock if the prior RCU read-side critical section has ever had both
    preemption and irqs enabled. However, the common use case is legal,
    namely where the entire RCU read-side critical section executes with
    irqs disabled, for example, when the scheduler lock is held across the
    entire lifetime of the RCU read-side critical section.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
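
    A sketch of the negative-value handshake in __rcu_read_unlock() (close to the
    preemptible-RCU code of that era, simplified):

        void __rcu_read_unlock(void)
        {
                struct task_struct *t = current;

                barrier();      /* critical section stays before the exit code */
                if (t->rcu_read_lock_nesting != 1) {
                        --t->rcu_read_lock_nesting;     /* still nested: just decrement */
                } else {
                        /* Outermost unlock: mark it as in flight... */
                        t->rcu_read_lock_nesting = INT_MIN;
                        barrier();
                        if (unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
                                rcu_read_unlock_special(t);
                        barrier();
                        /* ...and only now report being really outside. */
                        t->rcu_read_lock_nesting = 0;
                }
        }

    Any interrupt handler or context-switch code that sees a negative
    ->rcu_read_lock_nesting knows that an unlock is already in flight and leaves
    the special-case processing to it.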
     
  • It is illegal to have a grace period within a same-flavor RCU read-side
    critical section, so this commit adds lockdep-RCU checks to splat when
    such abuse is encountered. This commit does not detect more elaborate
    RCU deadlock situations. These situations might be a job for lockdep
    enhancements.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
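
    The checks are of this general form (rcu_lockdep_assert() and the lockdep
    maps are existing facilities; the exact call sites and message wording may
    differ):

        void synchronize_rcu(void)
        {
                rcu_lockdep_assert(!lock_is_held(&rcu_lock_map) &&
                                   !lock_is_held(&rcu_bh_lock_map) &&
                                   !lock_is_held(&rcu_sched_lock_map),
                                   "Illegal synchronize_rcu() in RCU read-side critical section");
                /* ...wait for a grace period as usual... */
        }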
     

12 Dec, 2011

2 commits

  • Both TINY_RCU's and TREE_RCU's implementations of rcu_boost() access
    the ->boost_tasks and ->exp_tasks fields without preventing concurrent
    changes to these fields. This commit therefore applies ACCESS_ONCE in
    order to prevent compiler mischief.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
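
    The change is essentially of this form (shown with TREE_RCU's rcu_node
    fields; TINY_RCU's control block gets the same treatment):

        /* Before: the compiler may reload these fields at any point. */
        if (rnp->exp_tasks == NULL && rnp->boost_tasks == NULL)
                return 0;       /* nothing to boost */

        /* After: snapshot each field exactly once. */
        if (ACCESS_ONCE(rnp->exp_tasks) == NULL &&
            ACCESS_ONCE(rnp->boost_tasks) == NULL)
                return 0;       /* nothing to boost */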
     
  • The current rcu_batch_end event trace records only the name of the RCU
    flavor and the total number of callbacks that remain queued on the
    current CPU. This is insufficient for testing and tuning the new
    dyntick-idle RCU_FAST_NO_HZ code, so this commit adds the idle state,
    along with an indication of whether any of the callbacks that were ready
    to invoke at the beginning of rcu_do_batch() are still queued.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

31 Oct, 2011

1 commit


29 Sep, 2011

3 commits


06 May, 2011

3 commits

  • This applies a trick from TREE_RCU boosting to TINY_RCU, eliminating
    code and adding comments. The key point is that it is possible for
    the booster thread itself to work out whether there is a normal or
    expedited boost required based solely on local information. There
    is therefore no need for boost initiation to know or care what type
    of boosting is required. In addition, when boosting is complete for
    a given grace period, then by definition there cannot be any more
    boosting for that grace period. This allows eliminating yet more
    state and statistics.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
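
    In outline, the booster chooses between normal and expedited boosting from
    purely local state (control-block field names as in TINY_PREEMPT_RCU of that
    era; the boost mechanics themselves are elided):

        static int rcu_boost(void)
        {
                struct list_head *tb;

                /* Nothing blocking either kind of grace period: nothing to boost. */
                if (rcu_preempt_ctrlblk.boost_tasks == NULL &&
                    rcu_preempt_ctrlblk.exp_tasks == NULL)
                        return 0;

                /* Expedited boosting takes precedence over normal boosting. */
                if (rcu_preempt_ctrlblk.exp_tasks != NULL)
                        tb = rcu_preempt_ctrlblk.exp_tasks;
                else
                        tb = rcu_preempt_ctrlblk.boost_tasks;

                /* ...boost the reader at *tb via an rt_mutex handoff... */
                return 1;       /* more boosting may be needed */
        }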
     
  • The ->boosted_this_gp field is a holdover from an earlier design that
    was to carry out multiple boost operations in parallel. It is not
    required by the current design, which boosts one task at a time, so this
    commit removes it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit removes an extraneous semicolon, fixes a bad comment, and
    folds an INIT_LIST_HEAD() into the preceding list_del() to get
    list_del_init().

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
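
    The fold in question is the standard list_del_init() idiom (field name
    illustrative):

        /* Before: remove the entry, then reinitialize it in a separate call. */
        list_del(&t->rcu_node_entry);
        INIT_LIST_HEAD(&t->rcu_node_entry);

        /* After: one helper does both. */
        list_del_init(&t->rcu_node_entry);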
     

05 Mar, 2011

1 commit


30 Nov, 2010

3 commits

  • RCU priority boosting's tracing did not distinguish between ongoing
    boosting and completion of boosting. This commit therefore adds this
    capability.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Add tracing for the tiny RCU implementations, including statistics on
    boosting in the case of TINY_PREEMPT_RCU and RCU_BOOST.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Add priority boosting, but only for TINY_PREEMPT_RCU. This is enabled by
    the default-off RCU_BOOST kernel configuration parameter. The priority to
    which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO
    kernel configuration parameter (defaulting to real-time priority 1), and
    the time to wait before boosting the readers blocking a given grace period
    is controlled by the RCU_BOOST_DELAY kernel configuration parameter
    (defaulting to 500 milliseconds).

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

18 Nov, 2010

1 commit

  • If RCU priority boosting is to be meaningful, callback invocation must
    be boosted in addition to preempted RCU readers. Otherwise, in the
    presence of CPU-bound real-time threads, the grace period ends, but the
    callbacks don't get invoked. If the callbacks don't get invoked, the
    associated memory
    doesn't get freed, so the system is still subject to OOM.

    But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
    moves the callback invocations to a kthread, which can be boosted easily.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
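
    In outline, callback processing moves from raise_softirq(RCU_SOFTIRQ) to a
    dedicated kthread that can be given a real-time priority (names and details
    approximate the rcutiny code of that era):

        static struct task_struct *rcu_kthread_task;
        static DECLARE_WAIT_QUEUE_HEAD(rcu_kthread_wq);
        static unsigned long have_rcu_kthread_work;

        /* Called where raise_softirq(RCU_SOFTIRQ) used to be called. */
        static void invoke_rcu_kthread(void)
        {
                have_rcu_kthread_work = 1;
                wake_up(&rcu_kthread_wq);
        }

        static int rcu_kthread(void *arg)
        {
                for (;;) {
                        wait_event_interruptible(rcu_kthread_wq,
                                                 have_rcu_kthread_work != 0);
                        have_rcu_kthread_work = 0;
                        __rcu_process_callbacks(&rcu_sched_ctrlblk);
                        __rcu_process_callbacks(&rcu_bh_ctrlblk);
                }
                return 0;       /* not reached */
        }

        static int __init rcu_spawn_kthread(void)
        {
                rcu_kthread_task = kthread_run(rcu_kthread, NULL, "rcu_kthread");
                return 0;       /* RCU_BOOST kernels can then raise its priority */
        }
        early_initcall(rcu_spawn_kthread);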
     

28 Aug, 2010

1 commit


21 Aug, 2010

1 commit


20 Aug, 2010

1 commit

  • Implement a small-memory-footprint uniprocessor-only implementation of
    preemptible RCU. This implementation uses but a single blocked-tasks
    list rather than the combinatorial number used per leaf rcu_node by
    TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
    processing. This version also takes advantage of uniprocessor execution
    to accelerate grace periods in the case where there are no readers.

    The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.

    This implementation is a step towards having RCU implementation driven
    off of the SMP and PREEMPT kernel configuration variables, which can
    happen once this implementation has accumulated sufficient experience.

    Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
    suggested by Steve Rostedt in order to avoid the compiler-reordering
    issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).

    As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
    savings compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time
    workloads, CONFIG_TINY_RCU is even better.

    CONFIG_TREE_PREEMPT_RCU

       text    data    bss     dec    filename
         13       0      0      13    kernel/rcupdate.o
       6170     825     28    7023    kernel/rcutree.o
                               ----
                               7026    Total

    CONFIG_TINY_PREEMPT_RCU

       text    data    bss     dec    filename
         13       0      0      13    kernel/rcupdate.o
       2081      81      8    2170    kernel/rcutiny.o
                               ----
                               2183    Total

    CONFIG_TINY_RCU (non-preemptible)

       text    data    bss     dec    filename
         13       0      0      13    kernel/rcupdate.o
        719      25      0     744    kernel/rcutiny.o
                                ---
                                757    Total

    Requested-by: Loïc Minier
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
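
    The single blocked-tasks list at the heart of the implementation looks
    roughly like this (field names follow the eventual rcutiny_plugin.h,
    abridged):

        struct rcu_preempt_ctrlblk {
                struct rcu_ctrlblk rcb;         /* callback lists, as in TINY_RCU */
                struct list_head blkd_tasks;    /* all readers that blocked in a
                                                   read-side critical section */
                struct list_head *gp_tasks;     /* first reader blocking the current
                                                   grace period, or NULL */
                struct list_head *exp_tasks;    /* first reader blocking the current
                                                   expedited grace period, or NULL */
                u8 gpnum;                       /* current grace period */
                u8 gpcpu;                       /* last grace period blocked by the CPU */
                u8 completed;                   /* last grace period completed */
        };

    Tasks preempted within a read-side critical section are queued on blkd_tasks;
    gp_tasks and exp_tasks simply point into that one list, rather than into the
    per-leaf-rcu_node lists that TREE_PREEMPT_RCU must maintain.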
     

11 May, 2010

1 commit