Commit 165d6c78ee24127dde5c750b2af0a239f9c11d1a

Authored by Paul E. McKenney
Committed by Linus Torvalds
1 parent 76d42bd969

[PATCH] RCU documentation: self-limiting updates and call_rcu()

An update to the RCU documentation calling out the
self-limiting-update-rate advantages of synchronize_rcu(), and describing
how to use call_rcu() in a way that results in self-limiting updates.
Self-limiting updates are important to avoiding RCU-induced OOM in face of
denial-of-service attacks.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Showing 2 changed files with 52 additions and 4 deletions

Documentation/RCU/checklist.txt
@@ -144,9 +144,47 @@
 whether the increased speed is worth it.
 
 8. Although synchronize_rcu() is a bit slower than is call_rcu(),
-it usually results in simpler code. So, unless update performance
-is important or the updaters cannot block, synchronize_rcu()
-should be used in preference to call_rcu().
+it usually results in simpler code. So, unless update
+performance is critically important or the updaters cannot block,
+synchronize_rcu() should be used in preference to call_rcu().
+
+An especially important property of the synchronize_rcu()
+primitive is that it automatically self-limits: if grace periods
+are delayed for whatever reason, then the synchronize_rcu()
+primitive will correspondingly delay updates. In contrast,
+code using call_rcu() should explicitly limit update rate in
+cases where grace periods are delayed, as failing to do so can
+result in excessive realtime latencies or even OOM conditions.
+
+Ways of gaining this self-limiting property when using call_rcu()
+include:
+
+a. Keeping a count of the number of data-structure elements
+used by the RCU-protected data structure, including those
+waiting for a grace period to elapse. Enforce a limit
+on this number, stalling updates as needed to allow
+previously deferred frees to complete.
+
+Alternatively, limit only the number awaiting deferred
+free rather than the total number of elements.
+
+b. Limiting update rate. For example, if updates occur only
+once per hour, then no explicit rate limiting is required,
+unless your system is already badly broken. The dcache
+subsystem takes this approach -- updates are guarded
+by a global lock, limiting their rate.
+
+c. Trusted update -- if updates can only be done manually by
+superuser or some other trusted user, then it might not
+be necessary to automatically limit them. The theory
+here is that superuser already has lots of ways to crash
+the machine.
+
+d. Use call_rcu_bh() rather than call_rcu(), in order to take
+advantage of call_rcu_bh()'s faster grace periods.
+
+e. Periodically invoke synchronize_rcu(), permitting a limited
+number of updates per grace period.
 
 9. All RCU list-traversal primitives, which include
 list_for_each_rcu(), list_for_each_entry_rcu(),
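
Illustration (not part of the commit): the second form of approach (a) in the checklist.txt hunk, limiting only the number of elements awaiting deferred free, might be sketched as follows. This is a userspace simulation, not kernel code. Every name in it (sim_call_rcu(), sim_grace_period_end(), MAX_PENDING, retire_elem(), and so on) is hypothetical and merely stands in for the real primitives. A real updater would block until enough pending callbacks have run, rather than forcing them as the simulation does.

```c
/* Userspace simulation only; all names here are hypothetical stand-ins. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PENDING 4	/* assumed cap on elements awaiting deferred free */

struct sim_rcu_head {
	struct sim_rcu_head *next;
	void (*func)(struct sim_rcu_head *head);
};

struct elem {
	int key;
	struct sim_rcu_head rcu;
};

static struct sim_rcu_head *sim_cb_list;	/* callbacks awaiting a grace period */
static int pending_frees;			/* elements queued but not yet freed */
static int sim_stalls;				/* times an updater had to wait */

/* Stand-in for call_rcu(): queue the callback without running it. */
static void sim_call_rcu(struct sim_rcu_head *head,
			 void (*func)(struct sim_rcu_head *head))
{
	head->func = func;
	head->next = sim_cb_list;
	sim_cb_list = head;
}

/* Stand-in for the end of a grace period: run every queued callback. */
static void sim_grace_period_end(void)
{
	struct sim_rcu_head *head = sim_cb_list;

	sim_cb_list = NULL;
	while (head) {
		struct sim_rcu_head *next = head->next;

		head->func(head);
		head = next;
	}
}

/* Deferred-free callback: release the element and drop the count. */
static void elem_free_cb(struct sim_rcu_head *head)
{
	struct elem *e = (struct elem *)((char *)head - offsetof(struct elem, rcu));

	free(e);
	pending_frees--;
}

/* Updater: retire one element, stalling first if the cap has been reached. */
static void retire_elem(struct elem *e)
{
	if (pending_frees >= MAX_PENDING) {
		sim_stalls++;
		sim_grace_period_end();	/* simulation of "wait for deferred frees" */
		assert(pending_frees == 0);
	}
	pending_frees++;
	sim_call_rcu(&e->rcu, elem_free_cb);
}

/* Retire n freshly allocated elements; return the peak pending count seen. */
int retire_many(int n)
{
	int peak = 0;

	for (int i = 0; i < n; i++) {
		struct elem *e = malloc(sizeof(*e));

		if (!e)
			abort();
		e->key = i;
		retire_elem(e);
		if (pending_frees > peak)
			peak = pending_frees;
	}
	return peak;
}
```

With these assumptions, the backlog of deferred frees never exceeds MAX_PENDING no matter how many elements are retired. The cost is that the updater occasionally stalls, which is the same trade synchronize_rcu() makes automatically.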
Documentation/RCU/whatisRCU.txt
@@ -184,7 +184,17 @@
 blocking, it registers a function and argument which are invoked
 after all ongoing RCU read-side critical sections have completed.
 This callback variant is particularly useful in situations where
-it is illegal to block.
+it is illegal to block or where update-side performance is
+critically important.
+
+However, the call_rcu() API should not be used lightly, as use
+of the synchronize_rcu() API generally results in simpler code.
+In addition, the synchronize_rcu() API has the nice property
+of automatically limiting update rate should grace periods
+be delayed. This property results in system resilience in face
+of denial-of-service attacks. Code using call_rcu() should limit
+update rate in order to gain this same sort of resilience. See
+checklist.txt for some approaches to limiting the update rate.
 
 rcu_assign_pointer()
 
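
Illustration (not part of the commit): the self-limiting property described in the whatisRCU.txt hunk can be seen in a toy tick-based model. The sketch below is a simplified userspace simulation under assumed rules, not a description of how the kernel schedules grace periods. It assumes that a grace period ends every gp_ticks ticks and, for simplicity, frees everything queued so far. The function names are hypothetical.

```c
/* Toy model only; function names and timing rules are assumptions. */
#include <assert.h>

/*
 * call_rcu()-like updater: queues one deferred free per tick and never
 * waits. Returns the largest backlog of deferred frees observed.
 */
int peak_backlog_async(int ticks, int gp_ticks)
{
	int backlog = 0;
	int peak = 0;

	for (int t = 1; t <= ticks; t++) {
		backlog++;			/* update proceeds immediately */
		if (backlog > peak)
			peak = backlog;
		if (t % gp_ticks == 0)
			backlog = 0;		/* grace period ends, callbacks run */
	}
	return peak;
}

/*
 * synchronize_rcu()-like updater: each update blocks until the next grace
 * period ends, so at most one element is ever awaiting free. Returns the
 * number of updates completed.
 */
int updates_done_sync(int ticks, int gp_ticks)
{
	int done = 0;
	int waiting = 0;

	for (int t = 1; t <= ticks; t++) {
		if (!waiting)
			waiting = 1;		/* start an update, then block */
		if (t % gp_ticks == 0 && waiting) {
			waiting = 0;		/* grace period ends, update completes */
			done++;
		}
	}
	return done;
}
```

In this model, stretching the grace period from 10 to 50 ticks grows the asynchronous updater's backlog from 10 to 50 pending frees. The blocking updater instead completes fewer updates (2 rather than 10 over 100 ticks) while never holding more than one element awaiting free. Delayed grace periods therefore show up as backlog in one case and as a reduced update rate in the other.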