Commit 4c54005ca438a8b46dd542b497d4f0dc2ca375e8
Committed by: Ingo Molnar
Parent: b6407e8639
rcu: 1Q2010 update for RCU documentation
Add expedited functions.  Review documentation and update obsolete
verbiage.  Also fix the advice for the RCU CPU-stall kernel configuration
parameter, and document RCU CPU-stall warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <12635142581866-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Showing 9 changed files with 258 additions and 136 deletions
Documentation/RCU/00-INDEX
... | ... | @@ -8,14 +8,18 @@ |
8 | 8 | - Using RCU to Protect Read-Mostly Linked Lists |
9 | 9 | NMI-RCU.txt |
10 | 10 | - Using RCU to Protect Dynamic NMI Handlers |
11 | +rcubarrier.txt | |
12 | + - RCU and Unloadable Modules | |
13 | +rculist_nulls.txt | |
14 | + - RCU list primitives for use with SLAB_DESTROY_BY_RCU | |
11 | 15 | rcuref.txt |
12 | 16 | - Reference-count design for elements of lists/arrays protected by RCU |
13 | 17 | rcu.txt |
14 | 18 | - RCU Concepts |
15 | -rcubarrier.txt | |
16 | - - Unloading modules that use RCU callbacks | |
17 | 19 | RTFP.txt |
18 | 20 | - List of RCU papers (bibliography) going back to 1980. |
21 | +stallwarn.txt | |
22 | + - RCU CPU stall warnings (CONFIG_RCU_CPU_STALL_DETECTOR) | |
19 | 23 | torture.txt |
20 | 24 | - RCU Torture Test Operation (CONFIG_RCU_TORTURE_TEST) |
21 | 25 | trace.txt |
Documentation/RCU/RTFP.txt
... | ... | @@ -25,10 +25,10 @@ |
25 | 25 | optimized for modern computer systems, which is not surprising given |
26 | 26 | that these overheads were not so expensive in the mid-80s. Nonetheless, |
27 | 27 | passive serialization appears to be the first deferred-destruction |
28 | -mechanism to be used in production. Furthermore, the relevant patent has | |
29 | -lapsed, so this approach may be used in non-GPL software, if desired. | |
30 | -(In contrast, use of RCU is permitted only in software licensed under | |
31 | -GPL. Sorry!!!) | |
28 | +mechanism to be used in production. Furthermore, the relevant patent | |
29 | +has lapsed, so this approach may be used in non-GPL software, if desired. | |
30 | +(In contrast, implementation of RCU is permitted only in software licensed | |
31 | +under either GPL or LGPL. Sorry!!!) | |
32 | 32 | |
33 | 33 | In 1990, Pugh [Pugh90] noted that explicitly tracking which threads |
34 | 34 | were reading a given data structure permitted deferred free to operate |
... | ... | @@ -150,6 +150,18 @@ |
150 | 150 | LWN "What is RCU?" series [PaulEMcKenney2007WhatIsRCUFundamentally, |
151 | 151 | PaulEMcKenney2008WhatIsRCUUsage, and PaulEMcKenney2008WhatIsRCUAPI]. |
152 | 152 | |
153 | +2008 saw a journal paper on real-time RCU [DinakarGuniguntala2008IBMSysJ], | |
154 | +a history of how Linux changed RCU more than RCU changed Linux | |
155 | +[PaulEMcKenney2008RCUOSR], and a design overview of hierarchical RCU | |
156 | +[PaulEMcKenney2008HierarchicalRCU]. | |
157 | + | |
158 | +2009 introduced user-level RCU algorithms [PaulEMcKenney2009MaliciousURCU], | |
159 | +which Mathieu Desnoyers is now maintaining [MathieuDesnoyers2009URCU] | |
160 | +[MathieuDesnoyersPhD]. TINY_RCU [PaulEMcKenney2009BloatWatchRCU] made | |
161 | +its appearance, as did expedited RCU [PaulEMcKenney2009expeditedRCU]. | |
162 | +The problem of resizeable RCU-protected hash tables may now be on a path | |
163 | +to a solution [JoshTriplett2009RPHash]. | |
164 | + | |
153 | 165 | Bibtex Entries |
154 | 166 | |
155 | 167 | @article{Kung80 |
... | ... | @@ -730,6 +742,11 @@ |
730 | 742 | " |
731 | 743 | } |
732 | 744 | |
745 | +# | |
746 | +# "What is RCU?" LWN series. | |
747 | +# | |
748 | +######################################################################## | |
749 | + | |
733 | 750 | @article{DinakarGuniguntala2008IBMSysJ |
734 | 751 | ,author="D. Guniguntala and P. E. McKenney and J. Triplett and J. Walpole" |
735 | 752 | ,title="The read-copy-update mechanism for supporting real-time applications on shared-memory multiprocessor systems with {Linux}" |
... | ... | @@ -819,5 +836,38 @@ |
819 | 836 | ,annotation=" |
820 | 837 | Uniprocessor assumptions allow simplified RCU implementation. |
821 | 838 | " |
839 | +} | |
840 | + | |
841 | +@unpublished{PaulEMcKenney2009expeditedRCU | |
842 | +,Author="Paul E. McKenney" | |
843 | +,Title="[{PATCH} -tip 0/3] expedited 'big hammer' {RCU} grace periods" | |
844 | +,month="June" | |
845 | +,day="25" | |
846 | +,year="2009" | |
847 | +,note="Available: | |
848 | +\url{http://lkml.org/lkml/2009/6/25/306} | |
849 | +[Viewed August 16, 2009]" | |
850 | +,annotation=" | |
851 | + First posting of expedited RCU to be accepted into -tip. | |
852 | +" | |
853 | +} | |
854 | + | |
855 | +@unpublished{JoshTriplett2009RPHash | |
856 | +,Author="Josh Triplett" | |
857 | +,Title="Scalable concurrent hash tables via relativistic programming" | |
858 | +,month="September" | |
859 | +,year="2009" | |
860 | +,note="Linux Plumbers Conference presentation" | |
861 | +,annotation=" | |
862 | + RP fun with hash tables. | |
863 | +" | |
864 | +} | |
865 | + | |
866 | +@phdthesis{MathieuDesnoyersPhD | |
867 | +, title = "Low-impact Operating System Tracing" | |
868 | +, author = "Mathieu Desnoyers" | |
869 | +, school = "Ecole Polytechnique de Montr\'{e}al" | |
870 | +, month = "December" | |
871 | +, year = 2009 | |
822 | 872 | } |
Documentation/RCU/checklist.txt
... | ... | @@ -8,13 +8,12 @@ |
8 | 8 | over a rather long period of time, but improvements are always welcome! |
9 | 9 | |
10 | 10 | 0. Is RCU being applied to a read-mostly situation? If the data |
11 | - structure is updated more than about 10% of the time, then | |
12 | - you should strongly consider some other approach, unless | |
13 | - detailed performance measurements show that RCU is nonetheless | |
14 | - the right tool for the job. Yes, you might think of RCU | |
15 | - as simply cutting overhead off of the readers and imposing it | |
16 | - on the writers. That is exactly why normal uses of RCU will | |
17 | - do much more reading than updating. | |
11 | + structure is updated more than about 10% of the time, then you | |
12 | + should strongly consider some other approach, unless detailed | |
13 | + performance measurements show that RCU is nonetheless the right | |
14 | + tool for the job. Yes, RCU does reduce read-side overhead by | |
15 | + increasing write-side overhead, which is exactly why normal uses | |
16 | + of RCU will do much more reading than updating. | |
18 | 17 | |
19 | 18 | Another exception is where performance is not an issue, and RCU |
20 | 19 | provides a simpler implementation. An example of this situation |
... | ... | @@ -35,13 +34,13 @@ |
35 | 34 | |
36 | 35 | If you choose #b, be prepared to describe how you have handled |
37 | 36 | memory barriers on weakly ordered machines (pretty much all of |
38 | - them -- even x86 allows reads to be reordered), and be prepared | |
39 | - to explain why this added complexity is worthwhile. If you | |
40 | - choose #c, be prepared to explain how this single task does not | |
41 | - become a major bottleneck on big multiprocessor machines (for | |
42 | - example, if the task is updating information relating to itself | |
43 | - that other tasks can read, there by definition can be no | |
44 | - bottleneck). | |
37 | + them -- even x86 allows later loads to be reordered to precede | |
38 | + earlier stores), and be prepared to explain why this added | |
39 | + complexity is worthwhile. If you choose #c, be prepared to | |
40 | + explain how this single task does not become a major bottleneck on | |
41 | + big multiprocessor machines (for example, if the task is updating | |
42 | + information relating to itself that other tasks can read, there | |
43 | + by definition can be no bottleneck). | |
45 | 44 | |
46 | 45 | 2. Do the RCU read-side critical sections make proper use of |
47 | 46 | rcu_read_lock() and friends? These primitives are needed |
... | ... | @@ -51,8 +50,10 @@ |
51 | 50 | actuarial risk of your kernel. |
52 | 51 | |
53 | 52 | As a rough rule of thumb, any dereference of an RCU-protected |
54 | - pointer must be covered by rcu_read_lock() or rcu_read_lock_bh() | |
55 | - or by the appropriate update-side lock. | |
53 | + pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(), | |
54 | + rcu_read_lock_sched(), or by the appropriate update-side lock. | |
55 | + Disabling of preemption can serve as rcu_read_lock_sched(), but | |
56 | + is less readable. | |
56 | 57 | |
57 | 58 | 3. Does the update code tolerate concurrent accesses? |
58 | 59 | |
59 | 60 | |
60 | 61 | |
... | ... | @@ -62,25 +63,27 @@ |
62 | 63 | of ways to handle this concurrency, depending on the situation: |
63 | 64 | |
64 | 65 | a. Use the RCU variants of the list and hlist update |
65 | - primitives to add, remove, and replace elements on an | |
66 | - RCU-protected list. Alternatively, use the RCU-protected | |
67 | - trees that have been added to the Linux kernel. | |
66 | + primitives to add, remove, and replace elements on | |
67 | + an RCU-protected list. Alternatively, use the other | |
68 | + RCU-protected data structures that have been added to | |
69 | + the Linux kernel. | |
68 | 70 | |
69 | 71 | This is almost always the best approach. |
70 | 72 | |
71 | 73 | b. Proceed as in (a) above, but also maintain per-element |
72 | 74 | locks (that are acquired by both readers and writers) |
73 | 75 | that guard per-element state. Of course, fields that |
74 | - the readers refrain from accessing can be guarded by the | |
75 | - update-side lock. | |
76 | + the readers refrain from accessing can be guarded by | |
77 | + some other lock acquired only by updaters, if desired. | |
76 | 78 | |
77 | 79 | This works quite well, also. |
78 | 80 | |
79 | 81 | c. Make updates appear atomic to readers. For example, |
80 | - pointer updates to properly aligned fields will appear | |
81 | - atomic, as will individual atomic primitives. Operations | |
82 | - performed under a lock and sequences of multiple atomic | |
83 | - primitives will -not- appear to be atomic. | |
82 | + pointer updates to properly aligned fields will | |
83 | + appear atomic, as will individual atomic primitives. | |
84 | + Sequences of operations performed under a lock will -not- | 
85 | + appear to be atomic to RCU readers, nor will sequences | |
86 | + of multiple atomic primitives. | |
84 | 87 | |
85 | 88 | This can work, but is starting to get a bit tricky. |
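The copy-and-replace strategy of item (d) above can be sketched as the following kernel-style fragment. This is a hypothetical illustration only: `struct foo`, `foo_lock`, and `foo_update()` are invented names, not part of any kernel API, and error handling is elided.

```c
/* Hypothetical list element; all names are illustrative only. */
struct foo {
	struct list_head list;
	int a;
	int b;
	struct rcu_head rcu;
};

static void foo_free_rcu(struct rcu_head *head)
{
	kfree(container_of(head, struct foo, rcu));
}

/* Updater: copy the old element, modify the copy, then swap it in
 * with list_replace_rcu().  Readers see either the old element or
 * the new one, never a half-updated mixture of the two. */
void foo_update(struct foo *old, int new_a, int new_b)
{
	struct foo *new = kmalloc(sizeof(*new), GFP_KERNEL);

	if (!new)
		return;			/* error handling elided */
	spin_lock(&foo_lock);
	*new = *old;			/* copy all fields */
	new->a = new_a;
	new->b = new_b;			/* both updates appear atomic to readers */
	list_replace_rcu(&old->list, &new->list);
	spin_unlock(&foo_lock);
	call_rcu(&old->rcu, foo_free_rcu);	/* free after a grace period */
}
```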
86 | 89 | |
... | ... | @@ -98,9 +101,9 @@ |
98 | 101 | a new structure containing updated values. |
99 | 102 | |
100 | 103 | 4. Weakly ordered CPUs pose special challenges. Almost all CPUs |
101 | - are weakly ordered -- even i386 CPUs allow reads to be reordered. | |
102 | - RCU code must take all of the following measures to prevent | |
103 | - memory-corruption problems: | |
104 | + are weakly ordered -- even x86 CPUs allow later loads to be | |
105 | + reordered to precede earlier stores. RCU code must take all of | |
106 | + the following measures to prevent memory-corruption problems: | |
104 | 107 | |
105 | 108 | a. Readers must maintain proper ordering of their memory |
106 | 109 | accesses. The rcu_dereference() primitive ensures that |
107 | 110 | |
... | ... | @@ -113,14 +116,21 @@ |
113 | 116 | The rcu_dereference() primitive is also an excellent |
114 | 117 | documentation aid, letting the person reading the code |
115 | 118 | know exactly which pointers are protected by RCU. |
119 | + Please note that compilers can also reorder code, and | |
120 | + they are becoming increasingly aggressive about doing | |
121 | + just that. The rcu_dereference() primitive therefore | |
122 | + also prevents destructive compiler optimizations. | |
116 | 123 | |
117 | - The rcu_dereference() primitive is used by the various | |
118 | - "_rcu()" list-traversal primitives, such as the | |
119 | - list_for_each_entry_rcu(). Note that it is perfectly | |
120 | - legal (if redundant) for update-side code to use | |
121 | - rcu_dereference() and the "_rcu()" list-traversal | |
122 | - primitives. This is particularly useful in code | |
123 | - that is common to readers and updaters. | |
124 | + The rcu_dereference() primitive is used by the | |
125 | + various "_rcu()" list-traversal primitives, such | |
126 | + as the list_for_each_entry_rcu(). Note that it is | |
127 | + perfectly legal (if redundant) for update-side code to | |
128 | + use rcu_dereference() and the "_rcu()" list-traversal | |
129 | + primitives. This is particularly useful in code that | |
130 | + is common to readers and updaters. However, neither | |
131 | + rcu_dereference() nor the "_rcu()" list-traversal | |
132 | + primitives can substitute for a good concurrency design | |
133 | + coordinating among multiple updaters. | |
124 | 134 | |
125 | 135 | b. If the list macros are being used, the list_add_tail_rcu() |
126 | 136 | and list_add_rcu() primitives must be used in order |
127 | 137 | |
... | ... | @@ -135,11 +145,14 @@ |
135 | 145 | readers. Similarly, if the hlist macros are being used, |
136 | 146 | the hlist_del_rcu() primitive is required. |
137 | 147 | |
138 | - The list_replace_rcu() primitive may be used to | |
139 | - replace an old structure with a new one in an | |
140 | - RCU-protected list. | |
148 | + The list_replace_rcu() and hlist_replace_rcu() primitives | |
149 | + may be used to replace an old structure with a new one | |
150 | + in their respective types of RCU-protected lists. | |
141 | 151 | |
142 | - d. Updates must ensure that initialization of a given | |
152 | + d. Rules similar to (4b) and (4c) apply to the "hlist_nulls" | |
153 | + type of RCU-protected linked lists. | |
154 | + | |
155 | + e. Updates must ensure that initialization of a given | |
143 | 156 | structure happens before pointers to that structure are |
144 | 157 | publicized. Use the rcu_assign_pointer() primitive |
145 | 158 | when publicizing a pointer to a structure that can |
146 | 159 | |
147 | 160 | |
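Rules 4a and 4e above pair up as the classic publish/subscribe pattern, sketched below. The global pointer `gp` and `struct foo` are hypothetical names for illustration; only the rcu_assign_pointer()/rcu_dereference() pairing is the point.

```c
/* Hypothetical RCU-protected global pointer. */
struct foo {
	int a;
};
static struct foo *gp;

/* Updater: fully initialize the structure, then publish it.
 * rcu_assign_pointer() supplies the write-side ordering that
 * rule 4e requires. */
void publish(int val)
{
	struct foo *p = kmalloc(sizeof(*p), GFP_KERNEL);

	if (!p)
		return;
	p->a = val;
	rcu_assign_pointer(gp, p);
}

/* Reader: rcu_dereference() orders the pointer fetch before the
 * dereference (even on Alpha) and suppresses harmful compiler
 * optimizations, per rule 4a. */
int read_a(void)
{
	struct foo *p;
	int ret = -1;

	rcu_read_lock();
	p = rcu_dereference(gp);
	if (p)
		ret = p->a;
	rcu_read_unlock();
	return ret;
}
```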
... | ... | @@ -151,17 +164,32 @@ |
151 | 164 | it cannot block. |
152 | 165 | |
153 | 166 | 6. Since synchronize_rcu() can block, it cannot be called from |
154 | - any sort of irq context. Ditto for synchronize_sched() and | |
155 | - synchronize_srcu(). | |
167 | + any sort of irq context. The same rule applies for | |
168 | + synchronize_rcu_bh(), synchronize_sched(), synchronize_srcu(), | |
169 | + synchronize_rcu_expedited(), synchronize_rcu_bh_expedited(), | |
170 | + synchronize_sched_expedited(), and synchronize_srcu_expedited(). | 
156 | 171 | |
157 | -7. If the updater uses call_rcu(), then the corresponding readers | |
158 | - must use rcu_read_lock() and rcu_read_unlock(). If the updater | |
159 | - uses call_rcu_bh(), then the corresponding readers must use | |
160 | - rcu_read_lock_bh() and rcu_read_unlock_bh(). If the updater | |
161 | - uses call_rcu_sched(), then the corresponding readers must | |
162 | - disable preemption. Mixing things up will result in confusion | |
163 | - and broken kernels. | |
172 | + The expedited forms of these primitives have the same semantics | |
173 | + as the non-expedited forms, but expediting is both expensive | |
174 | + and unfriendly to real-time workloads. Use of the expedited | |
175 | + primitives should be restricted to rare configuration-change | |
176 | + operations that would not normally be undertaken while a real-time | |
177 | + workload is running. | |
164 | 178 | |
179 | +7. If the updater uses call_rcu() or synchronize_rcu(), then the | |
180 | + corresponding readers must use rcu_read_lock() and | |
181 | + rcu_read_unlock(). If the updater uses call_rcu_bh() or | |
182 | + synchronize_rcu_bh(), then the corresponding readers must | |
183 | + use rcu_read_lock_bh() and rcu_read_unlock_bh(). If the | |
184 | + updater uses call_rcu_sched() or synchronize_sched(), then | |
185 | + the corresponding readers must disable preemption, possibly | |
186 | + by calling rcu_read_lock_sched() and rcu_read_unlock_sched(). | |
187 | + If the updater uses synchronize_srcu(), then the corresponding | 
188 | + readers must use srcu_read_lock() and srcu_read_unlock(), | |
189 | + and with the same srcu_struct. The rules for the expedited | |
190 | + primitives are the same as for their non-expedited counterparts. | |
191 | + Mixing things up will result in confusion and broken kernels. | |
192 | + | |
165 | 193 | One exception to this rule: rcu_read_lock() and rcu_read_unlock() |
166 | 194 | may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh() |
167 | 195 | in cases where local bottom halves are already known to be |
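Item 7's flavor-matching rule can be sketched as follows. The list, lock, and helper names (`foo_list`, `foo_lock`, `foo_free_rcu()`, `do_something_with()`) are hypothetical; the point is that an updater using call_rcu_bh() forces its readers onto the _bh read-side primitives.

```c
/* Reader: must use the _bh variants because the updater below
 * uses call_rcu_bh() (checklist item 7). */
void foo_reader(void)
{
	struct foo *p;

	rcu_read_lock_bh();		/* matches call_rcu_bh() */
	list_for_each_entry_rcu(p, &foo_list, list)
		do_something_with(p);
	rcu_read_unlock_bh();
}

/* Updater: removal under the update-side lock, deferred free via
 * the matching RCU-bh flavor. */
void foo_remove(struct foo *p)
{
	spin_lock(&foo_lock);
	list_del_rcu(&p->list);
	spin_unlock(&foo_lock);
	call_rcu_bh(&p->rcu, foo_free_rcu);	/* NOT call_rcu()! */
}
```

Using plain rcu_read_lock() here would be exactly the mixing-things-up mistake the checklist warns about: the _bh grace period could end while such a reader was still mid-traversal.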
... | ... | @@ -212,6 +240,8 @@ |
212 | 240 | e. Periodically invoke synchronize_rcu(), permitting a limited |
213 | 241 | number of updates per grace period. |
214 | 242 | |
243 | + The same cautions apply to call_rcu_bh() and call_rcu_sched(). | |
244 | + | |
215 | 245 | 9. All RCU list-traversal primitives, which include |
216 | 246 | rcu_dereference(), list_for_each_entry_rcu(), |
217 | 247 | list_for_each_continue_rcu(), and list_for_each_safe_rcu(), |
... | ... | @@ -229,7 +259,8 @@ |
229 | 259 | 10. Conversely, if you are in an RCU read-side critical section, |
230 | 260 | and you don't hold the appropriate update-side lock, you -must- |
231 | 261 | use the "_rcu()" variants of the list macros. Failing to do so |
232 | - will break Alpha and confuse people reading your code. | |
262 | + will break Alpha, cause aggressive compilers to generate bad code, | |
263 | + and confuse people trying to read your code. | |
233 | 264 | |
234 | 265 | 11. Note that synchronize_rcu() -only- guarantees to wait until |
235 | 266 | all currently executing rcu_read_lock()-protected RCU read-side |
236 | 267 | |
237 | 268 | |
... | ... | @@ -239,15 +270,21 @@ |
239 | 270 | rcu_read_lock()-protected read-side critical sections, do -not- |
240 | 271 | use synchronize_rcu(). |
241 | 272 | |
242 | - If you want to wait for some of these other things, you might | |
243 | - instead need to use synchronize_irq() or synchronize_sched(). | |
273 | + Similarly, disabling preemption is not an acceptable substitute | |
274 | + for rcu_read_lock(). Code that attempts to use preemption | |
275 | + disabling where it should be using rcu_read_lock() will break | |
276 | + in real-time kernel builds. | |
244 | 277 | |
278 | + If you want to wait for interrupt handlers, NMI handlers, and | |
279 | + code under the influence of preempt_disable(), you instead | |
280 | + need to use synchronize_irq() or synchronize_sched(). | |
281 | + | |
245 | 282 | 12. Any lock acquired by an RCU callback must be acquired elsewhere |
246 | 283 | with softirq disabled, e.g., via spin_lock_irqsave(), |
247 | 284 | spin_lock_bh(), etc. Failing to disable irq on a given |
248 | - acquisition of that lock will result in deadlock as soon as the | |
249 | - RCU callback happens to interrupt that acquisition's critical | |
250 | - section. | |
285 | + acquisition of that lock will result in deadlock as soon as | |
286 | + the RCU softirq handler happens to run your RCU callback while | |
287 | + interrupting that acquisition's critical section. | |
251 | 288 | |
252 | 289 | 13. RCU callbacks can be and are executed in parallel. In many cases, |
253 | 290 | the callback code simply wrappers around kfree(), so that this |
254 | 291 | |
... | ... | @@ -265,29 +302,30 @@ |
265 | 302 | not the case, a self-spawning RCU callback would prevent the |
266 | 303 | victim CPU from ever going offline.) |
267 | 304 | |
268 | -14. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) | |
269 | - may only be invoked from process context. Unlike other forms of | |
270 | - RCU, it -is- permissible to block in an SRCU read-side critical | |
271 | - section (demarked by srcu_read_lock() and srcu_read_unlock()), | |
272 | - hence the "SRCU": "sleepable RCU". Please note that if you | |
273 | - don't need to sleep in read-side critical sections, you should | |
274 | - be using RCU rather than SRCU, because RCU is almost always | |
275 | - faster and easier to use than is SRCU. | |
305 | +14. SRCU (srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(), | |
306 | + and synchronize_srcu_expedited()) may only be invoked from | |
307 | + process context. Unlike other forms of RCU, it -is- permissible | |
308 | + to block in an SRCU read-side critical section (demarked by | |
309 | + srcu_read_lock() and srcu_read_unlock()), hence the "SRCU": | |
310 | + "sleepable RCU". Please note that if you don't need to sleep | |
311 | + in read-side critical sections, you should be using RCU rather | |
312 | + than SRCU, because RCU is almost always faster and easier to | |
313 | + use than is SRCU. | |
276 | 314 | |
277 | 315 | Also unlike other forms of RCU, explicit initialization |
278 | 316 | and cleanup is required via init_srcu_struct() and |
279 | 317 | cleanup_srcu_struct(). These are passed a "struct srcu_struct" |
280 | 318 | that defines the scope of a given SRCU domain. Once initialized, |
281 | 319 | the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock() |
282 | - and synchronize_srcu(). A given synchronize_srcu() waits only | |
283 | - for SRCU read-side critical sections governed by srcu_read_lock() | |
284 | - and srcu_read_unlock() calls that have been passd the same | |
285 | - srcu_struct. This property is what makes sleeping read-side | |
286 | - critical sections tolerable -- a given subsystem delays only | |
287 | - its own updates, not those of other subsystems using SRCU. | |
288 | - Therefore, SRCU is less prone to OOM the system than RCU would | |
289 | - be if RCU's read-side critical sections were permitted to | |
290 | - sleep. | |
320 | + synchronize_srcu(), and synchronize_srcu_expedited(). A given | |
321 | + synchronize_srcu() waits only for SRCU read-side critical | |
322 | + sections governed by srcu_read_lock() and srcu_read_unlock() | |
323 | + calls that have been passed the same srcu_struct. This property | |
324 | + is what makes sleeping read-side critical sections tolerable -- | |
325 | + a given subsystem delays only its own updates, not those of other | |
326 | + subsystems using SRCU. Therefore, SRCU is less prone to OOM the | |
327 | + system than RCU would be if RCU's read-side critical sections | |
328 | + were permitted to sleep. | |
291 | 329 | |
292 | 330 | The ability to sleep in read-side critical sections does not |
293 | 331 | come for free. First, corresponding srcu_read_lock() and |
294 | 332 | |
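The SRCU rules in item 14 can be sketched as below. The srcu_struct name `my_srcu` and the surrounding functions are invented for this example; note the explicit init/cleanup and the index token that must be passed back to srcu_read_unlock().

```c
/* Hypothetical per-subsystem SRCU domain. */
static struct srcu_struct my_srcu;

int my_subsys_init(void)
{
	return init_srcu_struct(&my_srcu);	/* explicit init required */
}

void my_subsys_exit(void)
{
	cleanup_srcu_struct(&my_srcu);
}

void my_srcu_reader(void)
{
	int idx;

	idx = srcu_read_lock(&my_srcu);
	/* May block here -- this is what makes SRCU "sleepable". */
	msleep(10);
	srcu_read_unlock(&my_srcu, idx);	/* must pass idx back */
}

void my_srcu_updater(void)
{
	/* ... unlink the element from the data structure ... */
	synchronize_srcu(&my_srcu);	/* waits only for my_srcu readers */
	/* ... now safe to free the element ... */
}
```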
... | ... | @@ -311,13 +349,13 @@ |
311 | 349 | destructive operation, and -only- -then- invoke call_rcu(), |
312 | 350 | synchronize_rcu(), or friends. |
313 | 351 | |
314 | - Because these primitives only wait for pre-existing readers, | |
315 | - it is the caller's responsibility to guarantee safety to | |
316 | - any subsequent readers. | |
352 | + Because these primitives only wait for pre-existing readers, it | |
353 | + is the caller's responsibility to guarantee that any subsequent | |
354 | + readers will execute safely. | |
317 | 355 | |
318 | -16. The various RCU read-side primitives do -not- contain memory | |
319 | - barriers. The CPU (and in some cases, the compiler) is free | |
320 | - to reorder code into and out of RCU read-side critical sections. | |
321 | - It is the responsibility of the RCU update-side primitives to | |
322 | - deal with this. | |
356 | +16. The various RCU read-side primitives do -not- necessarily contain | 
357 | + memory barriers. You should therefore plan for the CPU | 
358 | + and the compiler to freely reorder code into and out of RCU | 
359 | + read-side critical sections. It is the responsibility of the | 
360 | + RCU update-side primitives to deal with this. | 
Documentation/RCU/rcu.txt
... | ... | @@ -75,6 +75,8 @@ |
75 | 75 | search for the string "Patent" in RTFP.txt to find them. |
76 | 76 | Of these, one was allowed to lapse by the assignee, and the |
77 | 77 | others have been contributed to the Linux kernel under GPL. |
78 | + There are now also LGPL implementations of user-level RCU | |
79 | + available (http://lttng.org/?q=node/18). | |
78 | 80 | |
79 | 81 | o I hear that RCU needs work in order to support realtime kernels? |
80 | 82 | |
... | ... | @@ -91,49 +93,5 @@ |
91 | 93 | |
92 | 94 | o What are all these files in this directory? |
93 | 95 | |
94 | - | |
95 | - NMI-RCU.txt | |
96 | - | |
97 | - Describes how to use RCU to implement dynamic | |
98 | - NMI handlers, which can be revectored on the fly, | |
99 | - without rebooting. | |
100 | - | |
101 | - RTFP.txt | |
102 | - | |
103 | - List of RCU-related publications and web sites. | |
104 | - | |
105 | - UP.txt | |
106 | - | |
107 | - Discussion of RCU usage in UP kernels. | |
108 | - | |
109 | - arrayRCU.txt | |
110 | - | |
111 | - Describes how to use RCU to protect arrays, with | |
112 | - resizeable arrays whose elements reference other | |
113 | - data structures being of the most interest. | |
114 | - | |
115 | - checklist.txt | |
116 | - | |
117 | - Lists things to check for when inspecting code that | |
118 | - uses RCU. | |
119 | - | |
120 | - listRCU.txt | |
121 | - | |
122 | - Describes how to use RCU to protect linked lists. | |
123 | - This is the simplest and most common use of RCU | |
124 | - in the Linux kernel. | |
125 | - | |
126 | - rcu.txt | |
127 | - | |
128 | - You are reading it! | |
129 | - | |
130 | - rcuref.txt | |
131 | - | |
132 | - Describes how to combine use of reference counts | |
133 | - with RCU. | |
134 | - | |
135 | - whatisRCU.txt | |
136 | - | |
137 | - Overview of how the RCU implementation works. Along | |
138 | - the way, presents a conceptual view of RCU. | |
96 | + See 00-INDEX for the list. |
Documentation/RCU/stallwarn.txt
1 | +Using RCU's CPU Stall Detector | |
2 | + | |
3 | +The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables | |
4 | +RCU's CPU stall detector, which detects conditions that unduly delay | |
5 | +RCU grace periods. The stall detector's idea of what constitutes | |
6 | +"unduly delayed" is controlled by a pair of C preprocessor macros: | |
7 | + | |
8 | +RCU_SECONDS_TILL_STALL_CHECK | |
9 | + | |
10 | + This macro defines the period of time that RCU will wait from | |
11 | + the beginning of a grace period until it issues an RCU CPU | |
12 | + stall warning. It is normally ten seconds. | |
13 | + | |
14 | +RCU_SECONDS_TILL_STALL_RECHECK | |
15 | + | |
16 | + This macro defines the period of time that RCU will wait after | |
17 | + issuing a stall warning until it issues another stall warning. | |
18 | + It is normally set to thirty seconds. | |
19 | + | |
20 | +RCU_STALL_RAT_DELAY | |
21 | + | |
22 | + The CPU stall detector tries to make the offending CPU rat on itself, | |
23 | + as this often gives better-quality stack traces. However, if | |
24 | + the offending CPU does not detect its own stall in the number | |
25 | + of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will | |
26 | + complain. This is normally set to two jiffies. | |
27 | + | |
28 | +The following problems can result in an RCU CPU stall warning: | |
29 | + | |
30 | +o A CPU looping in an RCU read-side critical section. | |
31 | + | |
32 | +o A CPU looping with interrupts disabled. | |
33 | + | |
34 | +o A CPU looping with preemption disabled. | |
35 | + | |
36 | +o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel | |
37 | + without invoking schedule(). | |
38 | + | |
39 | +o A bug in the RCU implementation. | |
40 | + | |
41 | +o A hardware failure. This is quite unlikely, but has occurred | |
42 | + at least once in a former life. A CPU failed in a running system, | |
43 | + becoming unresponsive, but not causing an immediate crash. | |
44 | + This resulted in a series of RCU CPU stall warnings, eventually | |
45 | + leading to the realization that the CPU had failed. | 
46 | + | |
47 | +The RCU, RCU-sched, and RCU-bh implementations have CPU stall warnings. | 
48 | +SRCU does not do so directly, but its calls to synchronize_sched() will | |
49 | +result in RCU-sched detecting any CPU stalls that might be occurring. | |
50 | + | |
51 | +To diagnose the cause of the stall, inspect the stack traces. The offending | |
52 | +function will usually be near the top of the stack. If you have a series | |
53 | +of stall warnings from a single extended stall, comparing the stack traces | |
54 | +can often help determine where the stall is occurring, which will usually | |
55 | +be in the function nearest the top of the stack that stays the same from | |
56 | +trace to trace. | |
57 | + | |
58 | +RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE. |
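As a concrete instance of the first failure mode in the new stallwarn.txt list, the deliberately buggy fragment below never leaves its RCU read-side critical section. The `flag` variable and `buggy_reader()` are invented for illustration.

```c
/* BUGGY EXAMPLE -- do not copy.  Spinning inside an RCU read-side
 * critical section prevents the grace period from ever completing. */
static int flag;

void buggy_reader(void)
{
	rcu_read_lock();
	while (!flag)
		cpu_relax();	/* spins for the entire wait...      */
	rcu_read_unlock();	/* ...so any concurrent grace period
				 * stalls, and a warning appears after
				 * RCU_SECONDS_TILL_STALL_CHECK seconds. */
}
```

The fix for this pattern is to drop out of the critical section on each pass (rcu_read_unlock()/rcu_read_lock() around the wait) or to use a proper wait primitive outside the critical section.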
Documentation/RCU/torture.txt
... | ... | @@ -30,6 +30,18 @@ |
30 | 30 | |
31 | 31 | This module has the following parameters: |
32 | 32 | |
33 | +fqs_duration Duration (in microseconds) of artificially induced bursts | |
34 | + of force_quiescent_state() invocations. In RCU | |
35 | + implementations having force_quiescent_state(), these | |
36 | + bursts help force races between forcing a given grace | |
37 | + period and that grace period ending on its own. | |
38 | + | |
39 | +fqs_holdoff Holdoff time (in microseconds) between consecutive calls | |
40 | + to force_quiescent_state() within a burst. | |
41 | + | |
42 | +fqs_stutter Wait time (in seconds) between consecutive bursts | |
43 | + of calls to force_quiescent_state(). | |
44 | + | |
33 | 45 | irqreaders Says to invoke RCU readers from irq level. This is currently |
34 | 46 | done via timers. Defaults to "1" for variants of RCU that |
35 | 47 | permit this. (Or, more accurately, variants of RCU that do |
Documentation/RCU/whatisRCU.txt
... | ... | @@ -327,7 +327,8 @@ |
327 | 327 | |
328 | 328 | b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh() |
329 | 329 | |
330 | -c. synchronize_sched() preempt_disable() / preempt_enable() | |
330 | +c. synchronize_sched() rcu_read_lock_sched() / rcu_read_unlock_sched() | |
331 | + preempt_disable() / preempt_enable() | |
331 | 332 | local_irq_save() / local_irq_restore() |
332 | 333 | hardirq enter / hardirq exit |
333 | 334 | NMI enter / NMI exit |
Documentation/filesystems/dentry-locking.txt
... | ... | @@ -62,7 +62,8 @@ |
62 | 62 | 2. Insertion of a dentry into the hash table is done using |
63 | 63 | hlist_add_head_rcu() which take care of ordering the writes - the |
64 | 64 | writes to the dentry must be visible before the dentry is |
65 | - inserted. This works in conjunction with hlist_for_each_rcu() while | |
65 | + inserted. This works in conjunction with hlist_for_each_rcu(), | |
66 | + which has since been replaced by hlist_for_each_entry_rcu(), while | |
66 | 67 | walking the hash chain. The only requirement is that all |
67 | 68 | initialization to the dentry must be done before |
68 | 69 | hlist_add_head_rcu() since we don't have dcache_lock protection |
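The lockless hash-chain walk described in the hunk above can be sketched with the current hlist_for_each_entry_rcu() interface. This is a simplified stand-in for the real dcache lookup, not the actual fs/dcache.c code; the hash computation and name comparison are abbreviated.

```c
/* Simplified sketch of an RCU-protected dcache hash-chain walk. */
struct hlist_head *head = d_hash(parent, hash);	/* dcache's chain lookup */
struct hlist_node *node;
struct dentry *dentry;

rcu_read_lock();
hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
	if (dentry->d_parent == parent &&
	    dentry->d_name.hash == hash)
		break;		/* candidate found; revalidate under lock */
}
rcu_read_unlock();
```

Because writers publish with hlist_add_head_rcu(), a reader traversing concurrently sees each dentry either fully initialized or not at all, which is the ordering guarantee the text above describes.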
lib/Kconfig.debug
... | ... | @@ -765,9 +765,9 @@ |
765 | 765 | CPUs are delaying the current grace period, but only when |
766 | 766 | the grace period extends for excessive time periods. |
767 | 767 | |
768 | - Say Y if you want RCU to perform such checks. | |
768 | + Say N if you want to disable such checks. | |
769 | 769 | |
770 | - Say N if you are unsure. | |
770 | + Say Y if you are unsure. | |
771 | 771 | |
772 | 772 | config KPROBES_SANITY_TEST |
773 | 773 | bool "Kprobes sanity tests" |