Commit 1f0532eb617d28f65c93593a1491f662f14f7eac

Authored by Nick Piggin
Committed by Pekka Enberg
1 parent 1eb5ac6466

mm: SLOB fix reclaim_state

SLOB does not correctly account reclaim_state.reclaimed_slab, so it will
break memory reclaim. Account it like SLAB does.

Cc: stable@kernel.org
Cc: linux-mm@kvack.org
Acked-by: Matt Mackall <mpm@selenic.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
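
The substance of the change is small: when SLOB hands whole pages back to the page allocator, it should credit them to current->reclaim_state->reclaimed_slab, the counter vmscan folds into its reclaim progress. Condensed from the diff below, the fixed helper looks like this (illustrative excerpt, not the full file):

	static void slob_free_pages(void *b, int order)
	{
		/* Credit freed slab pages toward reclaim progress, if reclaim is running. */
		if (current->reclaim_state)
			current->reclaim_state->reclaimed_slab += 1 << order;
		free_pages((unsigned long)b, order);
	}

The page-freeing path in slob_free() is also switched from a bare free_page() to this helper, so whole-page frees get accounted the same way.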

Showing 1 changed file (mm/slob.c) with 4 additions and 1 deletion. Inline diff follows; added lines are prefixed with "+", the removed line with "-".

/*
 * SLOB Allocator: Simple List Of Blocks
 *
 * Matt Mackall <mpm@selenic.com> 12/30/03
 *
 * NUMA support by Paul Mundt, 2007.
 *
 * How SLOB works:
 *
 * The core of SLOB is a traditional K&R style heap allocator, with
 * support for returning aligned objects. The granularity of this
 * allocator is as little as 2 bytes, however typically most architectures
 * will require 4 bytes on 32-bit and 8 bytes on 64-bit.
 *
 * The slob heap is a set of linked list of pages from alloc_pages(),
 * and within each page, there is a singly-linked list of free blocks
 * (slob_t). The heap is grown on demand. To reduce fragmentation,
 * heap pages are segregated into three lists, with objects less than
 * 256 bytes, objects less than 1024 bytes, and all other objects.
 *
 * Allocation from heap involves first searching for a page with
 * sufficient free blocks (using a next-fit-like approach) followed by
 * a first-fit scan of the page. Deallocation inserts objects back
 * into the free list in address order, so this is effectively an
 * address-ordered first fit.
 *
 * Above this is an implementation of kmalloc/kfree. Blocks returned
 * from kmalloc are prepended with a 4-byte header with the kmalloc size.
 * If kmalloc is asked for objects of PAGE_SIZE or larger, it calls
 * alloc_pages() directly, allocating compound pages so the page order
 * does not have to be separately tracked, and also stores the exact
 * allocation size in page->private so that it can be used to accurately
 * provide ksize(). These objects are detected in kfree() because slob_page()
 * is false for them.
 *
 * SLAB is emulated on top of SLOB by simply calling constructors and
 * destructors for every SLAB allocation. Objects are returned with the
 * 4-byte alignment unless the SLAB_HWCACHE_ALIGN flag is set, in which
 * case the low-level allocator will fragment blocks to create the proper
 * alignment. Again, objects of page-size or greater are allocated by
 * calling alloc_pages(). As SLAB objects know their size, no separate
 * size bookkeeping is necessary and there is essentially no allocation
 * space overhead, and compound pages aren't needed for multi-page
 * allocations.
 *
 * NUMA support in SLOB is fairly simplistic, pushing most of the real
 * logic down to the page allocator, and simply doing the node accounting
 * on the upper levels. In the event that a node id is explicitly
 * provided, alloc_pages_node() with the specified node id is used
 * instead. The common case (or when the node id isn't explicitly provided)
 * will default to the current node, as per numa_node_id().
 *
 * Node aware pages are still inserted in to the global freelist, and
 * these are scanned for by matching against the node id encoded in the
 * page flags. As a result, block allocations that can be satisfied from
 * the freelist will only be done so on pages residing on the same node,
 * in order to prevent random node placement.
 */

#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/mm.h>
+#include <linux/swap.h> /* struct reclaim_state */
#include <linux/cache.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/list.h>
#include <trace/kmemtrace.h>
#include <asm/atomic.h>

/*
 * slob_block has a field 'units', which indicates size of block if +ve,
 * or offset of next block if -ve (in SLOB_UNITs).
 *
 * Free blocks of size 1 unit simply contain the offset of the next block.
 * Those with larger size contain their size in the first SLOB_UNIT of
 * memory, and the offset of the next free block in the second SLOB_UNIT.
 */
#if PAGE_SIZE <= (32767 * 2)
typedef s16 slobidx_t;
#else
typedef s32 slobidx_t;
#endif

struct slob_block {
	slobidx_t units;
};
typedef struct slob_block slob_t;

/*
 * We use struct page fields to manage some slob allocation aspects,
 * however to avoid the horrible mess in include/linux/mm_types.h, we'll
 * just define our own struct page type variant here.
 */
struct slob_page {
	union {
		struct {
			unsigned long flags;	/* mandatory */
			atomic_t _count;	/* mandatory */
			slobidx_t units;	/* free units left in page */
			unsigned long pad[2];
			slob_t *free;		/* first free slob_t in page */
			struct list_head list;	/* linked list of free pages */
		};
		struct page page;
	};
};
static inline void struct_slob_page_wrong_size(void)
{ BUILD_BUG_ON(sizeof(struct slob_page) != sizeof(struct page)); }

/*
 * free_slob_page: call before a slob_page is returned to the page allocator.
 */
static inline void free_slob_page(struct slob_page *sp)
{
	reset_page_mapcount(&sp->page);
	sp->page.mapping = NULL;
}

/*
 * All partially free slob pages go on these lists.
 */
#define SLOB_BREAK1 256
#define SLOB_BREAK2 1024
static LIST_HEAD(free_slob_small);
static LIST_HEAD(free_slob_medium);
static LIST_HEAD(free_slob_large);

/*
 * is_slob_page: True for all slob pages (false for bigblock pages)
 */
static inline int is_slob_page(struct slob_page *sp)
{
	return PageSlobPage((struct page *)sp);
}

static inline void set_slob_page(struct slob_page *sp)
{
	__SetPageSlobPage((struct page *)sp);
}

static inline void clear_slob_page(struct slob_page *sp)
{
	__ClearPageSlobPage((struct page *)sp);
}

static inline struct slob_page *slob_page(const void *addr)
{
	return (struct slob_page *)virt_to_page(addr);
}

/*
 * slob_page_free: true for pages on free_slob_pages list.
 */
static inline int slob_page_free(struct slob_page *sp)
{
	return PageSlobFree((struct page *)sp);
}

static void set_slob_page_free(struct slob_page *sp, struct list_head *list)
{
	list_add(&sp->list, list);
	__SetPageSlobFree((struct page *)sp);
}

static inline void clear_slob_page_free(struct slob_page *sp)
{
	list_del(&sp->list);
	__ClearPageSlobFree((struct page *)sp);
}

#define SLOB_UNIT sizeof(slob_t)
#define SLOB_UNITS(size) (((size) + SLOB_UNIT - 1)/SLOB_UNIT)
#define SLOB_ALIGN L1_CACHE_BYTES

/*
 * struct slob_rcu is inserted at the tail of allocated slob blocks, which
 * were created with a SLAB_DESTROY_BY_RCU slab. slob_rcu is used to free
 * the block using call_rcu.
 */
struct slob_rcu {
	struct rcu_head head;
	int size;
};

/*
 * slob_lock protects all slob allocator structures.
 */
static DEFINE_SPINLOCK(slob_lock);

/*
 * Encode the given size and next info into a free slob block s.
 */
static void set_slob(slob_t *s, slobidx_t size, slob_t *next)
{
	slob_t *base = (slob_t *)((unsigned long)s & PAGE_MASK);
	slobidx_t offset = next - base;

	if (size > 1) {
		s[0].units = size;
		s[1].units = offset;
	} else
		s[0].units = -offset;
}

/*
 * Return the size of a slob block.
 */
static slobidx_t slob_units(slob_t *s)
{
	if (s->units > 0)
		return s->units;
	return 1;
}

/*
 * Return the next free slob block pointer after this one.
 */
static slob_t *slob_next(slob_t *s)
{
	slob_t *base = (slob_t *)((unsigned long)s & PAGE_MASK);
	slobidx_t next;

	if (s[0].units < 0)
		next = -s[0].units;
	else
		next = s[1].units;
	return base+next;
}

/*
 * Returns true if s is the last free block in its page.
 */
static int slob_last(slob_t *s)
{
	return !((unsigned long)slob_next(s) & ~PAGE_MASK);
}

static void *slob_new_pages(gfp_t gfp, int order, int node)
{
	void *page;

#ifdef CONFIG_NUMA
	if (node != -1)
		page = alloc_pages_node(node, gfp, order);
	else
#endif
		page = alloc_pages(gfp, order);

	if (!page)
		return NULL;

	return page_address(page);
}

static void slob_free_pages(void *b, int order)
{
+	if (current->reclaim_state)
+		current->reclaim_state->reclaimed_slab += 1 << order;
	free_pages((unsigned long)b, order);
}

/*
 * Allocate a slob block within a given slob_page sp.
 */
static void *slob_page_alloc(struct slob_page *sp, size_t size, int align)
{
	slob_t *prev, *cur, *aligned = NULL;
	int delta = 0, units = SLOB_UNITS(size);

	for (prev = NULL, cur = sp->free; ; prev = cur, cur = slob_next(cur)) {
		slobidx_t avail = slob_units(cur);

		if (align) {
			aligned = (slob_t *)ALIGN((unsigned long)cur, align);
			delta = aligned - cur;
		}
		if (avail >= units + delta) { /* room enough? */
			slob_t *next;

			if (delta) { /* need to fragment head to align? */
				next = slob_next(cur);
				set_slob(aligned, avail - delta, next);
				set_slob(cur, delta, aligned);
				prev = cur;
				cur = aligned;
				avail = slob_units(cur);
			}

			next = slob_next(cur);
			if (avail == units) { /* exact fit? unlink. */
				if (prev)
					set_slob(prev, slob_units(prev), next);
				else
					sp->free = next;
			} else { /* fragment */
				if (prev)
					set_slob(prev, slob_units(prev), cur + units);
				else
					sp->free = cur + units;
				set_slob(cur + units, avail - units, next);
			}

			sp->units -= units;
			if (!sp->units)
				clear_slob_page_free(sp);
			return cur;
		}
		if (slob_last(cur))
			return NULL;
	}
}

/*
 * slob_alloc: entry point into the slob allocator.
 */
static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
{
	struct slob_page *sp;
	struct list_head *prev;
	struct list_head *slob_list;
	slob_t *b = NULL;
	unsigned long flags;

	if (size < SLOB_BREAK1)
		slob_list = &free_slob_small;
	else if (size < SLOB_BREAK2)
		slob_list = &free_slob_medium;
	else
		slob_list = &free_slob_large;

	spin_lock_irqsave(&slob_lock, flags);
	/* Iterate through each partially free page, try to find room */
	list_for_each_entry(sp, slob_list, list) {
#ifdef CONFIG_NUMA
		/*
		 * If there's a node specification, search for a partial
		 * page with a matching node id in the freelist.
		 */
		if (node != -1 && page_to_nid(&sp->page) != node)
			continue;
#endif
		/* Enough room on this page? */
		if (sp->units < SLOB_UNITS(size))
			continue;

		/* Attempt to alloc */
		prev = sp->list.prev;
		b = slob_page_alloc(sp, size, align);
		if (!b)
			continue;

		/* Improve fragment distribution and reduce our average
		 * search time by starting our next search here. (see
		 * Knuth vol 1, sec 2.5, pg 449) */
		if (prev != slob_list->prev &&
				slob_list->next != prev->next)
			list_move_tail(slob_list, prev->next);
		break;
	}
	spin_unlock_irqrestore(&slob_lock, flags);

	/* Not enough space: must allocate a new page */
	if (!b) {
		b = slob_new_pages(gfp & ~__GFP_ZERO, 0, node);
		if (!b)
			return NULL;
		sp = slob_page(b);
		set_slob_page(sp);

		spin_lock_irqsave(&slob_lock, flags);
		sp->units = SLOB_UNITS(PAGE_SIZE);
		sp->free = b;
		INIT_LIST_HEAD(&sp->list);
		set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
		set_slob_page_free(sp, slob_list);
		b = slob_page_alloc(sp, size, align);
		BUG_ON(!b);
		spin_unlock_irqrestore(&slob_lock, flags);
	}
	if (unlikely((gfp & __GFP_ZERO) && b))
		memset(b, 0, size);
	return b;
}

/*
 * slob_free: entry point into the slob allocator.
 */
static void slob_free(void *block, int size)
{
	struct slob_page *sp;
	slob_t *prev, *next, *b = (slob_t *)block;
	slobidx_t units;
	unsigned long flags;

	if (unlikely(ZERO_OR_NULL_PTR(block)))
		return;
	BUG_ON(!size);

	sp = slob_page(block);
	units = SLOB_UNITS(size);

	spin_lock_irqsave(&slob_lock, flags);

	if (sp->units + units == SLOB_UNITS(PAGE_SIZE)) {
		/* Go directly to page allocator. Do not pass slob allocator */
		if (slob_page_free(sp))
			clear_slob_page_free(sp);
		spin_unlock_irqrestore(&slob_lock, flags);
		clear_slob_page(sp);
		free_slob_page(sp);
-		free_page((unsigned long)b);
+		slob_free_pages(b, 0);
		return;
	}

	if (!slob_page_free(sp)) {
		/* This slob page is about to become partially free. Easy! */
		sp->units = units;
		sp->free = b;
		set_slob(b, units,
			(void *)((unsigned long)(b +
					SLOB_UNITS(PAGE_SIZE)) & PAGE_MASK));
		set_slob_page_free(sp, &free_slob_small);
		goto out;
	}

	/*
	 * Otherwise the page is already partially free, so find reinsertion
	 * point.
	 */
	sp->units += units;

	if (b < sp->free) {
		if (b + units == sp->free) {
			units += slob_units(sp->free);
			sp->free = slob_next(sp->free);
		}
		set_slob(b, units, sp->free);
		sp->free = b;
	} else {
		prev = sp->free;
		next = slob_next(prev);
		while (b > next) {
			prev = next;
			next = slob_next(prev);
		}

		if (!slob_last(prev) && b + units == next) {
			units += slob_units(next);
			set_slob(b, units, slob_next(next));
		} else
			set_slob(b, units, next);

		if (prev + slob_units(prev) == b) {
			units = slob_units(b) + slob_units(prev);
			set_slob(prev, units, slob_next(b));
		} else
			set_slob(prev, slob_units(prev), b);
	}
out:
	spin_unlock_irqrestore(&slob_lock, flags);
}

/*
 * End of slob allocator proper. Begin kmem_cache_alloc and kmalloc frontend.
 */

#ifndef ARCH_KMALLOC_MINALIGN
#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long)
#endif

#ifndef ARCH_SLAB_MINALIGN
#define ARCH_SLAB_MINALIGN __alignof__(unsigned long)
#endif

void *__kmalloc_node(size_t size, gfp_t gfp, int node)
{
	unsigned int *m;
	int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
	void *ret;

	lockdep_trace_alloc(gfp);

	if (size < PAGE_SIZE - align) {
		if (!size)
			return ZERO_SIZE_PTR;

		m = slob_alloc(size + align, gfp, align, node);

		if (!m)
			return NULL;
		*m = size;
		ret = (void *)m + align;

		trace_kmalloc_node(_RET_IP_, ret,
				   size, size + align, gfp, node);
	} else {
		unsigned int order = get_order(size);

		ret = slob_new_pages(gfp | __GFP_COMP, get_order(size), node);
		if (ret) {
			struct page *page;
			page = virt_to_page(ret);
			page->private = size;
		}

		trace_kmalloc_node(_RET_IP_, ret,
				   size, PAGE_SIZE << order, gfp, node);
	}

	return ret;
}
EXPORT_SYMBOL(__kmalloc_node);

void kfree(const void *block)
{
	struct slob_page *sp;

	trace_kfree(_RET_IP_, block);

	if (unlikely(ZERO_OR_NULL_PTR(block)))
		return;

	sp = slob_page(block);
	if (is_slob_page(sp)) {
		int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
		unsigned int *m = (unsigned int *)(block - align);
		slob_free(m, *m + align);
	} else
		put_page(&sp->page);
}
EXPORT_SYMBOL(kfree);

/* can't use ksize for kmem_cache_alloc memory, only kmalloc */
size_t ksize(const void *block)
{
	struct slob_page *sp;

	BUG_ON(!block);
	if (unlikely(block == ZERO_SIZE_PTR))
		return 0;

	sp = slob_page(block);
	if (is_slob_page(sp)) {
		int align = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
		unsigned int *m = (unsigned int *)(block - align);
		return SLOB_UNITS(*m) * SLOB_UNIT;
	} else
		return sp->page.private;
}
EXPORT_SYMBOL(ksize);

struct kmem_cache {
	unsigned int size, align;
	unsigned long flags;
	const char *name;
	void (*ctor)(void *);
};

struct kmem_cache *kmem_cache_create(const char *name, size_t size,
	size_t align, unsigned long flags, void (*ctor)(void *))
{
	struct kmem_cache *c;

	c = slob_alloc(sizeof(struct kmem_cache),
		GFP_KERNEL, ARCH_KMALLOC_MINALIGN, -1);

	if (c) {
		c->name = name;
		c->size = size;
		if (flags & SLAB_DESTROY_BY_RCU) {
			/* leave room for rcu footer at the end of object */
			c->size += sizeof(struct slob_rcu);
		}
		c->flags = flags;
		c->ctor = ctor;
		/* ignore alignment unless it's forced */
		c->align = (flags & SLAB_HWCACHE_ALIGN) ? SLOB_ALIGN : 0;
		if (c->align < ARCH_SLAB_MINALIGN)
			c->align = ARCH_SLAB_MINALIGN;
		if (c->align < align)
			c->align = align;
	} else if (flags & SLAB_PANIC)
		panic("Cannot create slab cache %s\n", name);

	return c;
}
EXPORT_SYMBOL(kmem_cache_create);

void kmem_cache_destroy(struct kmem_cache *c)
{
	slob_free(c, sizeof(struct kmem_cache));
}
EXPORT_SYMBOL(kmem_cache_destroy);

void *kmem_cache_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
{
	void *b;

	if (c->size < PAGE_SIZE) {
		b = slob_alloc(c->size, flags, c->align, node);
		trace_kmem_cache_alloc_node(_RET_IP_, b, c->size,
					    SLOB_UNITS(c->size) * SLOB_UNIT,
					    flags, node);
	} else {
		b = slob_new_pages(flags, get_order(c->size), node);
		trace_kmem_cache_alloc_node(_RET_IP_, b, c->size,
					    PAGE_SIZE << get_order(c->size),
					    flags, node);
	}

	if (c->ctor)
		c->ctor(b);

	return b;
}
EXPORT_SYMBOL(kmem_cache_alloc_node);

static void __kmem_cache_free(void *b, int size)
{
	if (size < PAGE_SIZE)
		slob_free(b, size);
	else
		slob_free_pages(b, get_order(size));
}

static void kmem_rcu_free(struct rcu_head *head)
{
	struct slob_rcu *slob_rcu = (struct slob_rcu *)head;
	void *b = (void *)slob_rcu - (slob_rcu->size - sizeof(struct slob_rcu));

	__kmem_cache_free(b, slob_rcu->size);
}

void kmem_cache_free(struct kmem_cache *c, void *b)
{
	if (unlikely(c->flags & SLAB_DESTROY_BY_RCU)) {
		struct slob_rcu *slob_rcu;
		slob_rcu = b + (c->size - sizeof(struct slob_rcu));
		INIT_RCU_HEAD(&slob_rcu->head);
		slob_rcu->size = c->size;
		call_rcu(&slob_rcu->head, kmem_rcu_free);
	} else {
		__kmem_cache_free(b, c->size);
	}

	trace_kmem_cache_free(_RET_IP_, b);
}
EXPORT_SYMBOL(kmem_cache_free);

unsigned int kmem_cache_size(struct kmem_cache *c)
{
	return c->size;
}
EXPORT_SYMBOL(kmem_cache_size);

const char *kmem_cache_name(struct kmem_cache *c)
{
	return c->name;
}
EXPORT_SYMBOL(kmem_cache_name);

int kmem_cache_shrink(struct kmem_cache *d)
{
	return 0;
}
EXPORT_SYMBOL(kmem_cache_shrink);

int kmem_ptr_validate(struct kmem_cache *a, const void *b)
{
	return 0;
}

static unsigned int slob_ready __read_mostly;

int slab_is_available(void)
{
	return slob_ready;
}

void __init kmem_cache_init(void)
{
	slob_ready = 1;
}

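For comparison, "account it like SLAB does" refers to the equivalent bookkeeping at the end of SLAB's kmem_freepages() in mm/slab.c. Paraphrased from the same kernel era (nr_freed is 1 << cachep->gfporder there; surrounding context omitted, details may differ):

	/* mm/slab.c, tail of kmem_freepages(): credit the pages being returned. */
	if (current->reclaim_state)
		current->reclaim_state->reclaimed_slab += nr_freed;
	free_pages((unsigned long)addr, cachep->gfporder);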