14 Apr, 2008

6 commits

  • The per node counters are used mainly for showing data through the sysfs API.
    If that API is not compiled in then there is no point in keeping track of this
    data. Disable counters for the number of slabs and the number of total slabs
    if !SLUB_DEBUG. Incrementing the per node counters also touches a
    potentially contended cacheline, so disabling them could actually be a
    performance benefit for embedded systems.

    SLABINFO support is also affected. It must now depend on SLUB_DEBUG (which
    is on by default).

    Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
    if the system is not compiled with NUMA support.

    [penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
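
    A minimal sketch of the idea, assuming the counter in question is the per
    node nr_slabs field mentioned above (the helper name and exact placement
    are illustrative, not taken from the patch):

    #ifdef CONFIG_SLUB_DEBUG
    static inline void inc_slabs_node(struct kmem_cache_node *n)
    {
            /* Touches a potentially contended per node cacheline. */
            atomic_long_inc(&n->nr_slabs);
    }
    #else
    /* Without SLUB_DEBUG (and thus without the sysfs API) the counter
     * maintenance compiles away entirely. */
    static inline void inc_slabs_node(struct kmem_cache_node *n) {}
    #endif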
     
  • __free_slab does some diagnostics. The resetting of mapcount etc
    in discard_slab() can interfere with debug processing. So move
    the reset immediately before the page is freed.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Only output per cpu stats if the kernel is built for SMP.

    Use a capital "C" as a leading character for the processor number
    (same as the numa statistics that also use a capital letter "N").

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
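
    A rough sketch of what the per cpu part of the sysfs show routine looks
    like with this change (the get_cpu_slab() accessor and the stat array are
    assumed from the statistics patch below; this is not a quote of the code):

    #ifdef CONFIG_SMP
            for_each_online_cpu(cpu) {
                    unsigned x = get_cpu_slab(s, cpu)->stat[si];

                    if (x)
                            len += sprintf(buf + len, " C%d=%u", cpu, x);
            }
    #endif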
     
  • count_partial() is used by both slabinfo and the sysfs proc support. Move
    the function directly before the beginning of the sysfs code so that it can
    be easily found. Rework the preprocessor conditional to take into account
    that slub sysfs support depends on CONFIG_SYSFS *and* CONFIG_SLUB_DEBUG.

    Make CONFIG_SLUB_STATS depend on CONFIG_SLUB_DEBUG and CONFIG_SYSFS. There
    is no point in keeping statistics if no one can retrieve them.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
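
    For reference, count_partial() is a short helper along these lines (a
    sketch written from memory, not a verbatim quote of the patch):

    static unsigned long count_partial(struct kmem_cache_node *n)
    {
            unsigned long flags;
            unsigned long x = 0;
            struct page *page;

            spin_lock_irqsave(&n->list_lock, flags);
            list_for_each_entry(page, &n->partial, lru)
                    x += page->inuse;
            spin_unlock_irqrestore(&n->list_lock, flags);
            return x;
    }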
     
  • Move the definition of kmalloc_caches_dma() into a later #ifdef CONFIG_ZONE_DMA.
    This saves one #ifdef and leaves us with a total of two #ifdefs for dma slab support.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • As spotted by kmemcheck, we need to initialize the per-CPU ->stat array before
    using it.

    [kmem_cache_cpu structures are usually allocated from arrays defined via
    DEFINE_PER_CPU that are zeroed so we have not noticed this so far --cl].

    Reported-by: Vegard Nossum
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
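
    The fix boils down to zeroing the array wherever the per cpu structure is
    set up, roughly as follows (placement inside init_kmem_cache_cpu() is an
    assumption based on the description):

    static void init_kmem_cache_cpu(struct kmem_cache *s, struct kmem_cache_cpu *c)
    {
            /* ... existing initialization of page, freelist, offset, objsize ... */
    #ifdef CONFIG_SLUB_STATS
            memset(c->stat, 0, NR_SLUB_STAT_ITEMS * sizeof(unsigned));
    #endif
    }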
     

02 Apr, 2008

1 commit

  • Small typo in the patch recently merged to avoid the unused symbol
    message for count_partial(). Discussion thread with confirmation of fix at
    http://marc.info/?t=120696854400001&r=1&w=2

    The typo is in the check for whether the count_partial() function is needed,
    which was introduced by 53625b4204753b904addd40ca96d9ba802e6977d

    Signed-off-by: Christoph Lameter
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

18 Mar, 2008

1 commit

  • The fallback path needs to enable interrupts like done for
    the other page allocator calls. This was not necessary with
    the alternate fast path since we handled irq enable/disable in
    the slow path. The regular fastpath handles irq enable/disable
    around calls to the slow path so we need to restore the proper
    status before calling the page allocator from the slowpath.

    Signed-off-by: Christoph Lameter

    Christoph Lameter
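
    Schematically, the fallback now brackets the page allocator call the same
    way the other slowpath allocations do (a sketch of the pattern, not the
    exact diff):

            /* Interrupts were disabled by the fastpath before we got here. */
            if (gfpflags & __GFP_WAIT)
                    local_irq_enable();

            page = new_slab(s, gfpflags, node);

            if (gfpflags & __GFP_WAIT)
                    local_irq_disable();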
     

20 Feb, 2008

1 commit

  • This reverts commit 1f84260c8ce3b1ce26d4c1d6dedc2f33a3a29c0c, which is
    suspected to be the reason for some very occasional and hard-to-trigger
    crashes that usually look related to memory allocation (mostly reported
    in networking, but since that's generally the most common source of
    shortlived allocations - and allocations in interrupt contexts - that in
    itself is not a big clue).

    See for example
    http://bugzilla.kernel.org/show_bug.cgi?id=9973
    http://lkml.org/lkml/2008/2/19/278
    etc.

    One promising suspicion for what the root cause of bug is (which also
    explains why it's so hard to trigger in practice) came from Eric
    Dumazet:

    "I wonder how SLUB_FASTPATH is supposed to work, since it is affected
    by a classical ABA problem of lockless algo.

    cmpxchg_local(&c->freelist, object, object[c->offset]) can succeed,
    while an interrupt came (on this cpu), and several allocations were
    done, and one free was performed at the end of this interruption, so
    'object' was recycled.

    c->freelist can then contain the previous value (object), but
    object[c->offset] was changed by IRQ.

    We then put back in freelist an already allocated object."

    but another reason for the revert is simply that everybody agrees that
    this code was the main suspect just by virtue of the pattern of oopses.

    Cc: Torsten Kaiser
    Cc: Christoph Lameter
    Cc: Mathieu Desnoyers
    Cc: Pekka Enberg
    Cc: Ingo Molnar
    Cc: Eric Dumazet
    Signed-off-by: Linus Torvalds

    Linus Torvalds
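
    The window Eric describes can be pictured like this (an illustrative
    comment, not code from the tree):

    /*
     * Lockless fastpath, interrupts NOT disabled:
     *
     *   object = c->freelist;          reads pointer A
     *   next   = object[c->offset];    reads A's next pointer, B
     *
     *   --- interrupt on this cpu ---
     *   several allocations consume A and B; a later free pushes A back
     *   onto c->freelist, but A's next pointer is now some other object
     *   --- interrupt returns ---
     *
     *   cmpxchg_local(&c->freelist, A, B) succeeds because the head is A
     *   again, yet B may currently be an allocated object: a classic ABA
     *   problem, and the freelist is corrupted.
     */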
     

15 Feb, 2008

5 commits

  • Currently we hand off PAGE_SIZEd kmallocs to the page allocator in the
    mistaken belief that the page allocator can handle these allocations
    effectively. However, measurements indicate a minimum slowdown by a
    factor of 8 (and that is only SMP; NUMA is much worse) vs the slub fastpath,
    which causes regressions in tbench.

    Increase the number of kmalloc caches by one so that we again handle 4k
    kmallocs directly from slub. 4k page buffering for the page allocator
    will be performed by slub, as is already done by slab.

    At some point the page allocator fastpath should be fixed. A lot of the kernel
    would benefit from a faster ability to allocate a single page. If that is
    done then the 4k allocs may again be forwarded to the page allocator and this
    patch could be reverted.

    Reviewed-by: Pekka Enberg
    Acked-by: Mel Gorman
    Signed-off-by: Christoph Lameter

    Christoph Lameter
     
  • Slub already has two ways of allocating an object. One is via its own
    logic and the other is via the call to kmalloc_large to hand off object
    allocation to the page allocator. kmalloc_large is typically used
    for objects >= PAGE_SIZE.

    We can use that handoff to avoid failing if a higher order kmalloc slab
    allocation cannot be satisfied by the page allocator. If we reach the
    out of memory path then simply try a kmalloc_large(). kfree() can
    already handle the case of an object that was allocated via the page
    allocator and so this will work just fine (apart from object
    accounting...).

    For any kmalloc slab that already requires higher order allocs (which
    makes it impossible to use the page allocator fastpath!)
    we just use PAGE_ALLOC_COSTLY_ORDER to get the largest number of
    objects in one go from the page allocator slowpath.

    On a 4k platform this patch will lead to the following use of higher
    order pages for the following kmalloc slabs:

    8 ... 1024 order 0
    2048 .. 4096 order 3 (4k slab only after the next patch)

    We may waste some space if fallback occurs on a 2k slab but we
    are always able to fallback to an order 0 alloc.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
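
    Conceptually the out of memory path gains a branch like the following
    (is_kmalloc_cache() is a hypothetical helper used here only to express
    "this is one of the kmalloc caches"; the real check in the patch differs):

            new = new_slab(s, gfpflags, node);
            if (unlikely(!new)) {
                    if (is_kmalloc_cache(s))
                            /* Hand the request to the page allocator instead
                             * of failing; kfree() already copes with such
                             * objects. */
                            return kmalloc_large(s->objsize, gfpflags);
                    return NULL;
            }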
     
  • Currently we determine the gfp flags to pass to the page allocator
    each time a slab is being allocated.

    Determine the bits to be set at the time the slab is created. Store
    in a new allocflags field and add the flags in allocate_slab().

    Acked-by: Mel Gorman
    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
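
    A sketch of the precomputation at cache creation time (flag names as used
    by slub in this era; the exact placement is an assumption):

            s->allocflags = 0;
            if (order)
                    s->allocflags |= __GFP_COMP;    /* compound page for order > 0 */
            if (s->flags & SLAB_CACHE_DMA)
                    s->allocflags |= SLUB_DMA;      /* __GFP_DMA when CONFIG_ZONE_DMA */
            if (s->flags & SLAB_RECLAIM_ACCOUNT)
                    s->allocflags |= __GFP_RECLAIMABLE;

            /* allocate_slab() then only needs: flags |= s->allocflags; */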
     
  • slab_address() can become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Christoph Lameter

    Adrian Bunk
     
  • This adds a proper function for kmalloc page allocator pass-through. While it
    considerably simplifies any code that does slab tracing, I think it's a
    worthwhile cleanup in itself.

    Signed-off-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Pekka Enberg
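
    The pass-through helper is essentially a thin wrapper around the page
    allocator; a sketch consistent with the description above:

    static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
    {
            return (void *)__get_free_pages(flags | __GFP_COMP, get_order(size));
    }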
     

08 Feb, 2008

6 commits

  • fix checkpatch --file mm/slub.c errors and warnings.

    $ q-code-quality-compare
                          errors   lines of code   errors/KLOC
    mm/slub.c [before]        22            4204           5.2
    mm/slub.c [after]          0            4210           0

    no code changed:

     text   data  bss    dec   hex  filename
    22195   8634  136  30965  78f5  slub.o.before
    22195   8634  136  30965  78f5  slub.o.after

    md5:
    93cdfbec2d6450622163c590e1064358 slub.o.before.asm
    93cdfbec2d6450622163c590e1064358 slub.o.after.asm

    [clameter: rediffed against Pekka's cleanup patch, omitted
    moves of the name of a function to the start of line]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Christoph Lameter

    Ingo Molnar
     
  • Slub can use the non-atomic version to unlock because other flags will not
    get modified with the lock held.

    Signed-off-by: Nick Piggin
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton

    Nick Piggin
     
  • The statistics provided here allow the monitoring of allocator behavior, but
    at the cost of some (minimal) loss of performance. Counters are placed in
    SLUB's per cpu data structure. The statistics may grow the per cpu structure
    beyond one cacheline, which will increase the cache footprint of SLUB.

    There is a compile option to enable/disable the inclusion of the runtime
    statistics, and it is off by default.

    The slabinfo tool is enhanced to support these statistics via two options:

    -D Switches the line of information displayed for a slab from size
    mode to activity mode.

    -A Sorts the slabs displayed by activity. This allows the display of
    the slabs most important to the performance of a certain load.

    -r Reports detailed statistics for individual slabs (implied when a cache
    name is given on the command line).

    Example (tbench load):

    slabinfo -AD ->Shows the most active slabs

    Name                 Objects      Alloc       Free  %Fast
    skbuff_fclone_cache       33  111953835  111953835  99 99
    :0000192                2666    5283688    5281047  99 99
    :0001024                 849    5247230    5246389  83 83
    vm_area_struct          1349     119642     118355  91 22
    :0004096                  15      66753      66751  98 98
    :0000064                2067      25297      23383  98 78
    dentry                 10259      28635      18464  91 45
    :0000080               11004      18950       8089  98 98
    :0000096                1703      12358      10784  99 98
    :0000128                 762      10582       9875  94 18
    :0000512                 184       9807       9647  95 81
    :0002048                 479       9669       9195  83 65
    anon_vma                 777       9461       9002  99 71
    kmalloc-8               6492       9981       5624  99 97
    :0000768                 258       7174       6931  58 15

    So the skbuff_fclone_cache is of highest importance for the tbench load.
    Pretty high load on the 192 sized slab. Look for the aliases

    slabinfo -a | grep 000192
    :0000192 -r option implied if cache name is mentioned

    .... Usual output ...

    Slab Perf Counter         Alloc       Free  %Al  %Fr
    --------------------------------------------------
    Fastpath              111953360  111946981   99   99
    Slowpath                   1044       7423    0    0
    Page Alloc                  272        264    0    0
    Add partial                  25        325    0    0
    Remove partial               86        264    0    0
    RemoteObj/SlabFrozen        350       4832    0    0
    Total                 111954404  111954404

    Flushes 49 Refill 0
    Deactivate Full=325(92%) Empty=0(0%) ToHead=24(6%) ToTail=1(0%)

    Looks good because the fastpath is overwhelmingly taken.

    skbuff_head_cache:

    Slab Perf Counter         Alloc       Free  %Al  %Fr
    --------------------------------------------------
    Fastpath                5297262    5259882   99   99
    Slowpath                   4477      39586    0    0
    Page Alloc                  937        824    0    0
    Add partial                   0       2515    0    0
    Remove partial             1691        824    0    0
    RemoteObj/SlabFrozen       2621       9684    0    0
    Total                   5301739    5299468

    Deactivate Full=2620(100%) Empty=0(0%) ToHead=0(0%) ToTail=0(0%)

    Descriptions of the output:

    Total: The total number of allocations and frees that occurred for a
    slab

    Fastpath: The number of allocations/frees that used the fastpath.

    Slowpath: Other allocations

    Page Alloc: Number of calls to the page allocator as a result of slowpath
    processing

    Add Partial: Number of slabs added to the partial list through free or
    alloc (occurs during cpuslab flushes)

    Remove Partial: Number of slabs removed from the partial list as a result of
    allocations retrieving a partial slab or by a free freeing
    the last object of a slab.

    RemoteObj/Froz: How many times remotely freed objects were encountered when a
    slab was about to be deactivated. Frozen: How many times a
    free was able to skip list processing because the slab was in use
    as the cpuslab of another processor.

    Flushes: Number of times the cpuslab was flushed on request
    (kmem_cache_shrink, may result from races in __slab_alloc)

    Refill: Number of times we were able to refill the cpuslab from
    remotely freed objects for the same slab.

    Deactivate: Statistics on how slabs were deactivated and how they were
    put onto the partial list.

    In general, a dominant fastpath is very good. A slowpath without partial list
    processing is also desirable. Any touching of the partial list uses node
    specific locks, which may potentially cause list lock contention.

    Signed-off-by: Christoph Lameter

    Christoph Lameter
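
    Internally the counters are a per cpu array indexed by an event enum and
    bumped through a helper that compiles away when the option is off; a
    sketch (only a few of the event names are shown):

    enum stat_item {
            ALLOC_FASTPATH, ALLOC_SLOWPATH,
            FREE_FASTPATH, FREE_SLOWPATH,
            /* ... page allocs, partial list operations, deactivation ... */
            NR_SLUB_STAT_ITEMS
    };

    static inline void stat(struct kmem_cache_cpu *c, enum stat_item si)
    {
    #ifdef CONFIG_SLUB_STATS
            c->stat[si]++;
    #endif
    }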
     
  • Provide an alternate implementation of the SLUB fast paths for alloc
    and free using cmpxchg_local. The cmpxchg_local fast path is selected
    for arches that have CONFIG_FAST_CMPXCHG_LOCAL set. An arch should only
    set CONFIG_FAST_CMPXCHG_LOCAL if the cmpxchg_local is faster than an
    interrupt enable/disable sequence. This is known to be true for both
    x86 platforms so set FAST_CMPXCHG_LOCAL for both arches.

    Currently another requirement for the fastpath is that the kernel is
    compiled without preemption. The restriction will go away with the
    introduction of a new per cpu allocator and new per cpu operations.

    The advantages of a cmpxchg_local based fast path are:

    1. Potentially lower cycle count (30%-60% faster)

    2. There is no need to disable and enable interrupts on the fast path.
    Currently interrupts have to be disabled and enabled on every
    slab operation. This is likely avoiding a significant percentage
    of interrupt off / on sequences in the kernel.

    3. The disposal of freed slabs can occur with interrupts enabled.

    The alternate path is realized using #ifdef's. Several attempts to do the
    same with macros and inline functions resulted in a mess (in particular due
    to the strange way that local_irq_save() handles its argument and due
    to the need to define macros/functions that sometimes disable interrupts
    and sometimes do something else).

    [clameter: Stripped preempt bits and disabled fastpath if preempt is enabled]
    Signed-off-by: Christoph Lameter
    Reviewed-by: Pekka Enberg
    Signed-off-by: Andrew Morton

    Christoph Lameter
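
    The allocation side of the alternate fastpath looks roughly like this
    (reconstructed from the description; the end-of-freelist test depends on
    the following patch, and the statistics calls are omitted):

    #ifdef SLUB_FASTPATH
            void **object;
            struct kmem_cache_cpu *c = get_cpu_slab(s, raw_smp_processor_id());

            do {
                    object = c->freelist;
                    if (unlikely(is_end(object) || !node_match(c, node)))
                            return __slab_alloc(s, gfpflags, node, addr, c);
                    /* Retry if an interrupt changed the freelist under us. */
            } while (cmpxchg_local(&c->freelist, object,
                                   object[c->offset]) != object);

            return object;
    #endif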
     
  • We use a NULL pointer on freelists to signal that there are no more objects.
    However, the NULL end marker is the same for every slab, in contrast to the
    pointers to real objects, which lie in different ranges for different slab pages.

    Change the end pointer to be a pointer to the first object and set bit 0.
    Every slab will then have a different end pointer. This is necessary to ensure
    that end markers can be matched to the source slab during cmpxchg_local.

    Bring back the use of the mapping field by SLUB since we would otherwise have
    to call a relatively expensive function page_address() in __slab_alloc(). Use
    of the mapping field allows avoiding a call to page_address() in various other
    functions as well.

    There is no need to change the page_mapping() function since bit 0 is set on
    the mapping, as is also the case for anonymous pages. page_mapping(slab_page) will
    therefore still return NULL although the mapping field is overloaded.

    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton

    Christoph Lameter
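
    The end marker idea, illustratively (macro names here are made up; the
    helpers in the patch may be named differently):

    #define SLAB_END(first_object) \
            ((void *)((unsigned long)(first_object) | 1))   /* bit 0 set */
    #define IS_SLAB_END(p)  ((unsigned long)(p) & 1)

    /* Because the marker is derived from the slab's own first object, a
     * cmpxchg_local against a freelist that was meanwhile replaced by a
     * different slab cannot accidentally match the end marker. */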
     
  • gcc 4.2 spits out an annoying warning if one casts a const void *
    pointer to a void * pointer. No warning is generated if the
    conversion is done through an assignment.

    Signed-off-by: Christoph Lameter

    Christoph Lameter
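
    A standalone illustration of the workaround pattern (not the actual hunk):

    static void consume(void *p);

    void example(const void *x)
    {
            void *p = (void *)x;    /* conversion via an assignment: per the
                                     * description above, gcc 4.2 stays quiet */

            consume(p);
            /* consume((void *)x);     the cast at the call site is what made
             *                         gcc 4.2 complain */
    }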
     

05 Feb, 2008

5 commits

  • inconsistent {softirq-on-W} -> {in-softirq-W} usage.
    swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
    (&n->list_lock){-+..}, at: [] add_partial+0x31/0xa0
    {softirq-on-W} state was registered at:
    [] __lock_acquire+0x3e8/0x1140
    [] debug_check_no_locks_freed+0x188/0x1a0
    [] lock_acquire+0x55/0x70
    [] add_partial+0x31/0xa0
    [] _spin_lock+0x1e/0x30
    [] add_partial+0x31/0xa0
    [] kmem_cache_open+0x1cc/0x330
    [] _spin_unlock_irq+0x24/0x30
    [] create_kmalloc_cache+0x64/0xf0
    [] init_alloc_cpu_cpu+0x70/0x90
    [] kmem_cache_init+0x65/0x1d0
    [] start_kernel+0x23e/0x350
    [] _sinittext+0x12d/0x140
    [] 0xffffffffffffffff

    This change isn't really necessary for correctness, but it prevents lockdep
    from getting upset and then disabling itself.

    Signed-off-by: Peter Zijlstra
    Cc: Christoph Lameter
    Cc: Kamalesh Babulal
    Signed-off-by: Andrew Morton
    Signed-off-by: Christoph Lameter

    root
     
  • This fixes most of the obvious coding style violations in mm/slub.c as
    reported by checkpatch.

    Acked-by: Christoph Lameter
    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Christoph Lameter

    Pekka Enberg
     
  • Add a parameter to add_partial instead of having separate functions. The
    parameter allows more detailed control over where a slab page is placed in
    the partial queues.

    If we put slabs back at the front then they are likely immediately used for
    allocations. If they are put at the end then we can maximize the time that
    the partial slabs spend without being subject to allocations.

    When deactivating a slab we can put slabs that had objects freed to them
    remotely (visible because objects were put on the regular, lock-protected
    freelist) at the end of the list so that the cachelines of remote processors
    can cool down. Slabs that had objects from the local cpu freed to them
    (objects exist in the lockless freelist) are put at the front of the list to
    be reused ASAP in order to exploit the cache hot state of the local cpu.

    Patch seems to slightly improve tbench speed (1-2%).

    Signed-off-by: Christoph Lameter
    Reviewed-by: Pekka Enberg
    Signed-off-by: Andrew Morton

    Christoph Lameter
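
    A sketch of the parameterised helper (the signature follows the
    description; the lock handling shown is an assumption):

    static void add_partial(struct kmem_cache_node *n, struct page *page, int tail)
    {
            spin_lock(&n->list_lock);
            n->nr_partial++;
            if (tail)
                    /* let remote cpus' cachelines cool down */
                    list_add_tail(&page->lru, &n->partial);
            else
                    /* cache hot locally: make it the next slab to be picked */
                    list_add(&page->lru, &n->partial);
            spin_unlock(&n->list_lock);
    }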
     
  • The NUMA defrag works by allocating objects from partial slabs on remote
    nodes. Rename it to

    remote_node_defrag_ratio

    to be clear about this.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton

    Christoph Lameter
     
  • Move the counting function for objects in partial slabs so that it is placed
    before kmem_cache_shrink.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton

    Christoph Lameter