Documentation/kref.txt
===================================================
Adding reference counters (krefs) to kernel objects
===================================================

:Author: Corey Minyard <minyard@acm.org>
:Author: Thomas Hellstrom <thellstrom@vmware.com>

A lot of this was lifted from Greg Kroah-Hartman's 2004 OLS paper and
presentation on krefs, which can be found at:

  - http://www.kroah.com/linux/talks/ols_2004_kref_paper/Reprint-Kroah-Hartman-OLS2004.pdf
  - http://www.kroah.com/linux/talks/ols_2004_kref_talk/

Introduction
============

krefs allow you to add reference counters to your objects. If you
have objects that are used in multiple places and passed around, and
you don't have refcounts, your code is almost certainly broken. If
you want refcounts, krefs are the way to go.

To use a kref, add one to your data structures like::

    struct my_data {
            .
            .
            struct kref refcount;
            .
            .
    };

The kref can occur anywhere within the data structure.

Initialization
==============

You must initialize the kref after you allocate it. To do this, call
kref_init() like so::

    struct my_data *data;

    data = kmalloc(sizeof(*data), GFP_KERNEL);
    if (!data)
            return -ENOMEM;
    kref_init(&data->refcount);

This sets the refcount in the kref to 1.
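
The effect of starting at 1 can be tried out in userspace. The following is
a minimal sketch, not the kernel implementation: ``struct toy_kref``,
``toy_kref_init()``, ``toy_kref_get()``, ``toy_kref_put()``,
``struct toy_data`` and ``toy_data_new()`` are hypothetical stand-ins built
on C11 atomics, assumed here only to illustrate the counting behaviour:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel type. */
struct toy_kref {
	atomic_uint refcount;
};

/* Mirrors kref_init(): the creator starts out owning one reference. */
static void toy_kref_init(struct toy_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Mirrors kref_get(): the caller must already hold a valid reference. */
static void toy_kref_get(struct toy_kref *k)
{
	unsigned int old = atomic_fetch_add(&k->refcount, 1);

	assert(old != 0);	/* a get on a dead object is always a bug */
	(void)old;
}

/* Mirrors kref_put(): returns 1 if this put dropped the last reference. */
static int toy_kref_put(struct toy_kref *k,
			void (*release)(struct toy_kref *k))
{
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Hypothetical object embedding the counter, like struct my_data. */
struct toy_data {
	int payload;
	struct toy_kref refcount;
};

static int toy_release_calls;	/* counts releases, for demonstration only */

static void toy_data_release(struct toy_kref *k)
{
	/* Userspace equivalent of container_of(k, struct toy_data, refcount). */
	struct toy_data *d = (struct toy_data *)
		((char *)k - offsetof(struct toy_data, refcount));

	toy_release_calls++;
	free(d);
}

/* Allocate and initialize an object; returns NULL if allocation fails. */
static struct toy_data *toy_data_new(int payload)
{
	struct toy_data *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->payload = payload;
	toy_kref_init(&d->refcount);	/* count is now 1 */
	return d;
}
```

A caller that does one ``toy_data_new()``, one ``toy_kref_get()`` and two
``toy_kref_put()`` calls sees the release routine run exactly once, on the
second put.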

Kref rules
==========

Once you have an initialized kref, you must follow these rules:

1) If you make a non-temporary copy of a pointer, especially if
   it can be passed to another thread of execution, you must
   increment the refcount with kref_get() before passing it off::

       kref_get(&data->refcount);

   If you already have a valid pointer to a kref-ed structure (the
   refcount cannot go to zero) you may do this without a lock.

2) When you are done with a pointer, you must call kref_put()::

       kref_put(&data->refcount, data_release);

   If this is the last reference to the pointer, the release
   routine will be called. If the code never tries to get
   a valid pointer to a kref-ed structure without already
   holding a valid pointer, it is safe to do this without
   a lock.

3) If the code attempts to gain a reference to a kref-ed structure
   without already holding a valid pointer, it must serialize access
   so that a kref_put() cannot occur during the kref_get(), and the
   structure must remain valid during the kref_get().

For example, if you allocate some data and then pass it to another
thread to process::

    void data_release(struct kref *ref)
    {
            struct my_data *data = container_of(ref, struct my_data, refcount);
            kfree(data);
    }

    /* kthread_run() expects a thread function that returns int. */
    int more_data_handling(void *cb_data)
    {
            struct my_data *data = cb_data;
            .
            . do stuff with data here
            .
            kref_put(&data->refcount, data_release);
            return 0;
    }

    int my_data_handler(void)
    {
            int rv = 0;
            struct my_data *data;
            struct task_struct *task;
            data = kmalloc(sizeof(*data), GFP_KERNEL);
            if (!data)
                    return -ENOMEM;
            kref_init(&data->refcount);

            /* Take the new thread's reference before the handoff. */
            kref_get(&data->refcount);
            task = kthread_run(more_data_handling, data, "more_data_handling");
            if (IS_ERR(task)) {
                    rv = PTR_ERR(task);
                    /* The thread never started, so drop its reference. */
                    kref_put(&data->refcount, data_release);
                    goto out;
            }

            .
            . do stuff with data here
            .
    out:
            kref_put(&data->refcount, data_release);
            return rv;
    }

This way, it doesn't matter in what order the two threads handle the
data; kref_put() knows when the data is no longer referenced and
releases it. The kref_get() does not require a lock, since we already
have a valid pointer that we own a refcount for. The put needs no lock
because nothing tries to get the data without already holding a
pointer.

Note that the "before" in rule 1 is very important. You should never
do something like::

    task = kthread_run(more_data_handling, data, "more_data_handling");
    if (IS_ERR(task)) {
            rv = PTR_ERR(task);
            goto out;
    } else
            /* BAD BAD BAD - get is after the handoff */
            kref_get(&data->refcount);

Don't assume you know what you are doing and use the above construct.
First of all, you may not know what you are doing. Second, you may
know what you are doing (there are some situations where locking is
involved where the above may be legal) but someone else who doesn't
know what they are doing may change the code or copy the code. It's
bad style. Don't do it.

There are some situations where you can optimize the gets and puts.
For instance, if you are done with an object and enqueuing it for
something else or passing it off to something else, there is no reason
to do a get then a put::

    /* Silly extra get and put */
    kref_get(&obj->ref);
    enqueue(obj);
    kref_put(&obj->ref, obj_cleanup);

Just do the enqueue. A comment about this is always welcome::

    enqueue(obj);
    /* We are done with obj, so we pass our refcount off
       to the queue.  DON'T TOUCH obj AFTER HERE! */

The last rule (rule 3) is the nastiest one to handle. Say, for
instance, you have a list of items that are each kref-ed, and you wish
to get the first one. You can't just pull the first item off the list
and kref_get() it. That violates rule 3 because you are not already
holding a valid pointer. You must add a mutex (or some other lock).
For instance::

    static DEFINE_MUTEX(mutex);
    static LIST_HEAD(q);
    struct my_data {
            struct kref refcount;
            struct list_head link;
    };

    static struct my_data *get_entry(void)
    {
            struct my_data *entry = NULL;
            mutex_lock(&mutex);
            if (!list_empty(&q)) {
                    entry = container_of(q.next, struct my_data, link);
                    kref_get(&entry->refcount);
            }
            mutex_unlock(&mutex);
            return entry;
    }

    static void release_entry(struct kref *ref)
    {
            struct my_data *entry = container_of(ref, struct my_data, refcount);

            list_del(&entry->link);
            kfree(entry);
    }

    static void put_entry(struct my_data *entry)
    {
            mutex_lock(&mutex);
            kref_put(&entry->refcount, release_entry);
            mutex_unlock(&mutex);
    }

The kref_put() return value is useful if you do not want to hold the
lock during the whole release operation. Say you didn't want to call
kfree() with the lock held in the example above (since it is kind of
pointless to do so). You could use kref_put() as follows::

    static void release_entry(struct kref *ref)
    {
            /* All work is done after the return from kref_put(). */
    }

    static void put_entry(struct my_data *entry)
    {
            mutex_lock(&mutex);
            if (kref_put(&entry->refcount, release_entry)) {
                    list_del(&entry->link);
                    mutex_unlock(&mutex);
                    kfree(entry);
            } else
                    mutex_unlock(&mutex);
    }

This is really more useful if you have to call other routines as part
of the free operations that could take a long time or might claim the
same lock. Note that doing everything in the release routine is still
preferred as it is a little neater.
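
The same pattern can be exercised outside the kernel. The sketch below is
only an illustration under stated assumptions: ``struct toy_kref``,
``struct toy_entry``, ``toy_list``, ``toy_lock`` and the ``toy_*()`` helpers
are hypothetical userspace stand-ins (C11 atomics, a pthread mutex and a
hand-rolled singly linked list), not kernel APIs. It shows the put routine
using the return value to unlink under the lock and free after dropping it:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref. */
struct toy_kref {
	atomic_uint refcount;
};

/* Returns 1 if this put dropped the last reference, like kref_put(). */
static int toy_kref_put(struct toy_kref *k,
			void (*release)(struct toy_kref *k))
{
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

struct toy_entry {
	struct toy_kref refcount;
	struct toy_entry *next;		/* protected by toy_lock */
};

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_entry *toy_list;	/* list head, protected by toy_lock */

/* Create an entry holding one reference and put it on the list. */
static struct toy_entry *toy_add_entry(void)
{
	struct toy_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	atomic_init(&e->refcount.refcount, 1);
	pthread_mutex_lock(&toy_lock);
	e->next = toy_list;
	toy_list = e;
	pthread_mutex_unlock(&toy_lock);
	return e;
}

/* Look up the first entry and take a reference while holding the lock. */
static struct toy_entry *toy_get_first(void)
{
	struct toy_entry *e;

	pthread_mutex_lock(&toy_lock);
	e = toy_list;
	if (e)
		atomic_fetch_add(&e->refcount.refcount, 1);
	pthread_mutex_unlock(&toy_lock);
	return e;
}

/* Remove e from the list; the caller must hold toy_lock. */
static void toy_unlink(struct toy_entry *e)
{
	struct toy_entry **pp;

	for (pp = &toy_list; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			return;
		}
	}
}

static void toy_release_noop(struct toy_kref *k)
{
	(void)k;	/* all work is done after toy_kref_put() returns */
}

/* Returns 1 if the entry was freed, 0 if other references remain. */
static int toy_put_entry(struct toy_entry *e)
{
	assert(e != NULL);
	pthread_mutex_lock(&toy_lock);
	if (toy_kref_put(&e->refcount, toy_release_noop)) {
		toy_unlink(e);
		pthread_mutex_unlock(&toy_lock);
		free(e);	/* the slow part runs without the lock */
		return 1;
	}
	pthread_mutex_unlock(&toy_lock);
	return 0;
}
```

The entry is unlinked while the lock is still held, so no lookup can find it
once the count has reached zero; only the final ``free()`` is moved outside
the critical section.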

The above example could also be optimized using kref_get_unless_zero() in
the following way::

    static struct my_data *get_entry(void)
    {
            struct my_data *entry = NULL;
            mutex_lock(&mutex);
            if (!list_empty(&q)) {
                    entry = container_of(q.next, struct my_data, link);
                    if (!kref_get_unless_zero(&entry->refcount))
                            entry = NULL;
            }
            mutex_unlock(&mutex);
            return entry;
    }

    static void release_entry(struct kref *ref)
    {
            struct my_data *entry = container_of(ref, struct my_data, refcount);

            mutex_lock(&mutex);
            list_del(&entry->link);
            mutex_unlock(&mutex);
            kfree(entry);
    }

    static void put_entry(struct my_data *entry)
    {
            kref_put(&entry->refcount, release_entry);
    }

This is useful because it removes the mutex lock around kref_put() in
put_entry(), but it is important that kref_get_unless_zero() is enclosed
in the same critical section that finds the entry in the lookup table;
otherwise kref_get_unless_zero() may reference already freed memory.
Note that it is illegal to use kref_get_unless_zero() without checking
its return value. If you are sure (by already having a valid pointer) that
kref_get_unless_zero() will return true, then use kref_get() instead.
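
The "increment only if not already zero" semantics can be sketched with a
compare-and-swap loop. This is a userspace illustration using C11 atomics,
not the kernel's implementation; ``struct toy_kref``, ``toy_kref_init()``,
``toy_kref_put_and_test()`` and ``toy_kref_get_unless_zero()`` are
hypothetical names assumed for the example:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct kref. */
struct toy_kref {
	atomic_uint refcount;
};

static void toy_kref_init(struct toy_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Drop one reference; returns true if the count reached zero. */
static bool toy_kref_put_and_test(struct toy_kref *k)
{
	unsigned int old = atomic_fetch_sub(&k->refcount, 1);

	assert(old != 0);	/* putting a dead object is a bug */
	return old == 1;
}

/*
 * Take a reference only if the object is still alive.
 * Returns true on success, false if the count had already hit zero.
 */
static bool toy_kref_get_unless_zero(struct toy_kref *k)
{
	unsigned int old = atomic_load(&k->refcount);

	while (old != 0) {
		/* On failure, old is updated to the current value. */
		if (atomic_compare_exchange_weak(&k->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

Once the count has dropped to zero it stays at zero: a later
``toy_kref_get_unless_zero()`` fails instead of resurrecting the object,
which is why callers must always check the return value.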

Krefs and RCU
=============

The function kref_get_unless_zero() also makes it possible to use RCU
locking for lookups in the above example::

    struct my_data {
            struct rcu_head rhead;
            .
            struct kref refcount;
            .
            .
    };

    static struct my_data *get_entry_rcu(void)
    {
            struct my_data *entry = NULL;
            rcu_read_lock();
            if (!list_empty(&q)) {
                    entry = container_of(q.next, struct my_data, link);
                    if (!kref_get_unless_zero(&entry->refcount))
                            entry = NULL;
            }
            rcu_read_unlock();
            return entry;
    }

    static void release_entry_rcu(struct kref *ref)
    {
            struct my_data *entry = container_of(ref, struct my_data, refcount);

            mutex_lock(&mutex);
            list_del_rcu(&entry->link);
            mutex_unlock(&mutex);
            kfree_rcu(entry, rhead);
    }

    static void put_entry(struct my_data *entry)
    {
            kref_put(&entry->refcount, release_entry_rcu);
    }

But note that the struct kref member needs to remain in valid memory for an
RCU grace period after release_entry_rcu() is called. That can be
accomplished by using kfree_rcu(entry, rhead) as done above, or by calling
synchronize_rcu() before using kfree(), but note that synchronize_rcu() may
sleep for a substantial amount of time.