23 Mar, 2011

2 commits


21 Mar, 2011

2 commits


12 Mar, 2011

3 commits


11 Mar, 2011

2 commits

  • Use the this_cpu_cmpxchg_double functionality to implement a lockless
    allocation algorithm on arches that support fast this_cpu_ops.

    Each of the per cpu pointers is paired with a transaction id that ensures
    that updates of the per cpu information can only occur in sequence on
    a certain cpu.

    A transaction id is a "long" integer composed of an event number and the
    cpu number. The event number is incremented for every change to the per cpu
    state. This means the cmpxchg instruction can verify that nothing interfered
    with the update, that we are updating the per cpu structure of the processor
    where we picked up the information, and that we are still on that processor
    when the update occurs. (A small user-space sketch of this tid encoding
    follows this entry.)

    This results in a significant decrease in overhead on the fastpaths. It
    also makes it easy to adopt the fastpath for realtime kernels, since the
    path is lockless and does not require the per cpu area to stay current over
    the whole critical section; it only has to be current at the beginning and
    at the end of the critical section.

    So there is not even a need to disable preemption.

    Test results show that the fastpath cycle count is reduced by up to ~40%
    (the alloc/free test goes from ~140 cycles down to ~80). The slowpath for
    kfree adds a few cycles.

    Sadly this does nothing for the slowpath, which is where the main
    performance issues in slub are, but best-case performance rises
    significantly. (For that, see the more complex slub patches that require
    cmpxchg_double.)

    Kmalloc: alloc/free test

    Before:

    10000 times kmalloc(8)/kfree -> 134 cycles
    10000 times kmalloc(16)/kfree -> 152 cycles
    10000 times kmalloc(32)/kfree -> 144 cycles
    10000 times kmalloc(64)/kfree -> 142 cycles
    10000 times kmalloc(128)/kfree -> 142 cycles
    10000 times kmalloc(256)/kfree -> 132 cycles
    10000 times kmalloc(512)/kfree -> 132 cycles
    10000 times kmalloc(1024)/kfree -> 135 cycles
    10000 times kmalloc(2048)/kfree -> 135 cycles
    10000 times kmalloc(4096)/kfree -> 135 cycles
    10000 times kmalloc(8192)/kfree -> 144 cycles
    10000 times kmalloc(16384)/kfree -> 754 cycles

    After:

    10000 times kmalloc(8)/kfree -> 78 cycles
    10000 times kmalloc(16)/kfree -> 78 cycles
    10000 times kmalloc(32)/kfree -> 82 cycles
    10000 times kmalloc(64)/kfree -> 88 cycles
    10000 times kmalloc(128)/kfree -> 79 cycles
    10000 times kmalloc(256)/kfree -> 79 cycles
    10000 times kmalloc(512)/kfree -> 85 cycles
    10000 times kmalloc(1024)/kfree -> 82 cycles
    10000 times kmalloc(2048)/kfree -> 82 cycles
    10000 times kmalloc(4096)/kfree -> 85 cycles
    10000 times kmalloc(8192)/kfree -> 82 cycles
    10000 times kmalloc(16384)/kfree -> 706 cycles

    Kmalloc: Repeatedly allocate then free test

    Before:

    10000 times kmalloc(8) -> 211 cycles kfree -> 113 cycles
    10000 times kmalloc(16) -> 174 cycles kfree -> 115 cycles
    10000 times kmalloc(32) -> 235 cycles kfree -> 129 cycles
    10000 times kmalloc(64) -> 222 cycles kfree -> 120 cycles
    10000 times kmalloc(128) -> 343 cycles kfree -> 139 cycles
    10000 times kmalloc(256) -> 827 cycles kfree -> 147 cycles
    10000 times kmalloc(512) -> 1048 cycles kfree -> 272 cycles
    10000 times kmalloc(1024) -> 2043 cycles kfree -> 528 cycles
    10000 times kmalloc(2048) -> 4002 cycles kfree -> 571 cycles
    10000 times kmalloc(4096) -> 7740 cycles kfree -> 628 cycles
    10000 times kmalloc(8192) -> 8062 cycles kfree -> 850 cycles
    10000 times kmalloc(16384) -> 8895 cycles kfree -> 1249 cycles

    After:

    10000 times kmalloc(8) -> 190 cycles kfree -> 129 cycles
    10000 times kmalloc(16) -> 76 cycles kfree -> 123 cycles
    10000 times kmalloc(32) -> 126 cycles kfree -> 124 cycles
    10000 times kmalloc(64) -> 181 cycles kfree -> 128 cycles
    10000 times kmalloc(128) -> 310 cycles kfree -> 140 cycles
    10000 times kmalloc(256) -> 809 cycles kfree -> 165 cycles
    10000 times kmalloc(512) -> 1005 cycles kfree -> 269 cycles
    10000 times kmalloc(1024) -> 1999 cycles kfree -> 527 cycles
    10000 times kmalloc(2048) -> 3967 cycles kfree -> 570 cycles
    10000 times kmalloc(4096) -> 7658 cycles kfree -> 637 cycles
    10000 times kmalloc(8192) -> 8111 cycles kfree -> 859 cycles
    10000 times kmalloc(16384) -> 8791 cycles kfree -> 1173 cycles

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
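
    A minimal user-space sketch of how such a transaction id can be packed.
    The helper names and the per-cpu step below are assumptions chosen for
    illustration, not the kernel's exact code; in slub the tid is additionally
    paired with the per cpu freelist pointer in the cmpxchg_double:

        #include <stdio.h>

        /*
         * Illustration only: pack a cpu number and an event counter into one
         * unsigned long.  TID_STEP is the number of possible cpus rounded up
         * to a power of two, so the low bits always identify the cpu and
         * adding TID_STEP bumps the event counter without touching them.
         */
        #define NR_CPUS  64UL
        #define TID_STEP NR_CPUS              /* already a power of two here */

        static unsigned long init_tid(unsigned int cpu)      { return cpu; }
        static unsigned long next_tid(unsigned long tid)     { return tid + TID_STEP; }
        static unsigned int  tid_to_cpu(unsigned long tid)   { return tid % TID_STEP; }
        static unsigned long tid_to_event(unsigned long tid) { return tid / TID_STEP; }

        int main(void)
        {
                unsigned long tid = init_tid(3);     /* state owned by cpu 3 */

                tid = next_tid(tid);                 /* one fastpath operation later */
                tid = next_tid(tid);                 /* ... and another */

                printf("cpu=%u event=%lu\n", tid_to_cpu(tid), tid_to_event(tid));

                /* A cmpxchg of the (freelist, tid) pair fails if either the
                 * cpu or the event count changed since the values were read. */
                return 0;
        }

    Because the cpu number lives in the low bits, being migrated to another cpu
    is caught by the cmpxchg just as reliably as an interleaving allocation or
    free on the same cpu.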
     
  • The following patch will make the fastpaths lockless and will no longer
    require interrupts to be disabled. Calling the free hook with irq disabled
    will no longer be possible.

    Move the slab_free_hook_irq() logic into slab_free_hook. Only disable
    interrupts if the features are selected that require callbacks with
    interrupts off and reenable after calls have been made.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

27 Feb, 2011

1 commit

  • mm/slub.c: In function 'ksize':
    mm/slub.c:2728: error: implicit declaration of function 'slab_ksize'

    slab_ksize() needs to be moved out of the CONFIG_SLUB_DEBUG section.

    Acked-by: Randy Dunlap
    Acked-by: David Rientjes
    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Pekka Enberg

    Mariusz Kozlowski
     

23 Feb, 2011

1 commit

  • Recent use of ksize() in the network stack (commit ca44ac38: "net: don't
    reallocate skb->head unless the current one hasn't the needed extra size
    or is shared") triggers kmemcheck warnings, because ksize() can return
    more space than kmemcheck is aware of.

    Pekka Enberg noticed that SLAB+kmemcheck does the right thing, while
    SLUB+kmemcheck doesn't. (A user-space analogy follows this entry.)

    Bugzilla reference #27212

    Reported-by: Christian Casteyde
    Suggested-by: Pekka Enberg
    Signed-off-by: Eric Dumazet
    Acked-by: David S. Miller
    Acked-by: David Rientjes
    Acked-by: Christoph Lameter
    CC: Changli Gao
    CC: Andrew Morton
    Signed-off-by: Pekka Enberg

    Eric Dumazet
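
    A hedged user-space analogy, not the kernel code: glibc's
    malloc_usable_size() plays the role of ksize() here. The allocator may hand
    out more usable space than was requested, and a checking tool has to be
    told about the extra room before a caller may legitimately use it, which is
    exactly the situation ksize() creates for kmemcheck:

        #include <stdio.h>
        #include <stdlib.h>
        #include <malloc.h>            /* malloc_usable_size() on glibc */

        int main(void)
        {
                char *p = malloc(8);
                size_t usable;

                if (!p)
                        return 1;

                usable = malloc_usable_size(p);
                printf("requested 8 bytes, usable %zu bytes\n", usable);

                /* Legal: the allocator really did hand out 'usable' bytes,
                 * but a checker that only knows about the requested 8 bytes
                 * would flag writes beyond them unless it is told otherwise. */
                if (usable > 8)
                        p[usable - 1] = 0;

                free(p);
                return 0;
        }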
     

24 Jan, 2011

1 commit


15 Jan, 2011

1 commit


14 Jan, 2011

1 commit


11 Jan, 2011

2 commits

  • The purpose of the locking is to prevent removal and additions
    of nodes when statistics are gathered for a slab cache. So we
    need to avoid racing with memory hotplug functionality.

    It is enough to take the memory hotplug locks there instead
    of the slub_lock.

    online_pages() currently does not acquire the memory_hotplug
    lock. Another patch will be submitted by the memory hotplug
    authors to take the memory hotplug lock and to describe how the
    memory hotplug lock protects against addition and removal of
    nodes from non-hotplug data structures.

    Cc: # 2.6.37
    Reported-and-tested-by: Bart Van Assche
    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    slub: Fix a crash during slabinfo -v
    tracing/slab: Move kmalloc tracepoint out of inline code
    slub: Fix slub_lock down/up imbalance
    slub: Fix build breakage in Documentation/vm
    slub tracing: move trace calls out of always inlined functions to reduce kernel code size
    slub: move slabinfo.c to tools/slub/slabinfo.c

    Linus Torvalds
     

07 Jan, 2011

1 commit


04 Dec, 2010

2 commits

  • Commit f7cb1933621bce66a77f690776a16fe3ebbc4d58 ("SLUB: Pass active
    and inactive redzone flags instead of boolean to debug functions")
    missed two instances of check_object(). This caused a lot of warnings
    during 'slabinfo -v' finally leading to a crash:

    BUG ext4_xattr: Freepointer corrupt
    ...
    BUG buffer_head: Freepointer corrupt
    ...
    BUG ext4_alloc_context: Freepointer corrupt
    ...
    ...
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: [] file_sb_list_del+0x1c/0x35
    PGD 79d78067 PUD 79e67067 PMD 0
    Oops: 0002 [#1] SMP
    last sysfs file: /sys/kernel/slab/:t-0000192/validate

    This patch fixes the problem by converting the two missed instances.

    Acked-by: Christoph Lameter
    Signed-off-by: Tero Roponen
    Signed-off-by: Pekka Enberg

    Tero Roponen
     

14 Nov, 2010

1 commit

  • There are two places that do not release the slub_lock.

    The respective bugs were introduced by the sysfs changes ab4d5ed5 (slub:
    Enable sysfs support for !CONFIG_SLUB_DEBUG) and 2bce6485 (slub: Allow
    removal of slab caches during boot). (A user-space sketch of this bug
    class follows this entry.)

    Acked-by: Christoph Lameter
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Pekka Enberg

    Pavel Emelyanov
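
    A user-space sketch of the bug class, with a pthread mutex standing in for
    the slub_lock (names hypothetical): every early-exit path has to release
    the lock, which is what the two spots above failed to do.

        #include <pthread.h>
        #include <stdio.h>

        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

        /* Buggy shape: an early return leaves the lock held forever. */
        static int remove_cache_buggy(int cache_is_busy)
        {
                pthread_mutex_lock(&lock);
                if (cache_is_busy)
                        return -1;               /* BUG: lock never released */
                /* ... unlink the cache ... */
                pthread_mutex_unlock(&lock);
                return 0;
        }

        /* Fixed shape: a single unlock that every path goes through. */
        static int remove_cache_fixed(int cache_is_busy)
        {
                int err = 0;

                pthread_mutex_lock(&lock);
                if (cache_is_busy) {
                        err = -1;
                        goto out;
                }
                /* ... unlink the cache ... */
        out:
                pthread_mutex_unlock(&lock);
                return err;
        }

        int main(void)
        {
                printf("%d %d\n", remove_cache_fixed(1), remove_cache_fixed(0));
                (void)remove_cache_buggy;        /* shown for contrast only */
                return 0;
        }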
     

06 Nov, 2010

2 commits

  • There are two places that do not release the slub_lock.

    The respective bugs were introduced by the sysfs changes ab4d5ed5 (slub:
    Enable sysfs support for !CONFIG_SLUB_DEBUG) and 2bce6485 (slub: Allow
    removal of slab caches during boot).

    Acked-by: Christoph Lameter
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Pekka Enberg

    Pavel Emelyanov
     
  • Having the trace calls defined in the always inlined kmalloc functions
    in include/linux/slub_def.h causes a lot of code duplication, as the
    trace functions get instantiated for each kmalloc call site. This can
    simply be avoided by pushing the trace calls down into the functions in
    slub.c.

    On my x86_64 build this patch shrinks the code size of the kernel by
    approx 36K and also shrinks the code size of many modules -- too many to
    list here ;)

    size vmlinux (2.6.36) reports:

       text    data     bss     dec    hex filename
    5410611  743172  828928 6982711 6a8c37 vmlinux
    5373738  744244  828928 6946910 6a005e vmlinux + patch

    The resulting kernel has had some testing and the kmalloc trace still
    seems to work. (A simplified sketch of the inline-wrapper pattern follows
    this entry.)

    This patch
    - moves trace_kmalloc out of the inlined kmalloc() and pushes it down
    into kmem_cache_alloc_trace() so that it only gets instantiated once.

    - renames kmem_cache_alloc_notrace() to kmem_cache_alloc_trace() to
    indicate that it now does have tracing. (Maybe this would be better
    called something like kmalloc_kmem_cache?)

    - adds a new function kmalloc_order() to handle allocation and tracing
    of large allocations of page order.

    - removes tracing from the inlined kmalloc_large(), replacing it with a
    call to kmalloc_order();

    - moves tracing out of the inlined kmalloc_node() and pushes it down into
    kmem_cache_alloc_node_trace()

    - renames kmem_cache_alloc_node_notrace() to
    kmem_cache_alloc_node_trace()

    - removes the include of trace/events/kmem.h from slub_def.h.

    v2
    - keep kmalloc_order_trace inline when !CONFIG_TRACE

    Signed-off-by: Richard Kennedy
    Signed-off-by: Pekka Enberg

    Richard Kennedy
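
    A simplified, hedged sketch of the pattern (names invented for
    illustration, not the actual slub_def.h): the trace call lives in one
    out-of-line function, while the inline wrapper seen by every call site
    stays tiny, so the tracing code is instantiated only once.

        #include <stdio.h>
        #include <stdlib.h>

        /* Stand-in for the real tracepoint; a stub keeps the example runnable. */
        static void trace_alloc(size_t size, void *ret)
        {
                printf("alloc %zu -> %p\n", size, ret);
        }

        /* Out-of-line: the trace call is emitted exactly once, here. */
        void *my_alloc_trace(size_t size)
        {
                void *ret = malloc(size);

                trace_alloc(size, ret);
                return ret;
        }

        /* Inline wrapper used by every call site: no trace code duplicated. */
        static inline void *my_alloc(size_t size)
        {
                return my_alloc_trace(size);
        }

        int main(void)
        {
                void *p = my_alloc(32);

                free(p);
                return 0;
        }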
     

07 Oct, 2010

1 commit

  • This patch fixes the following build breakage when memory hotplug is enabled on
    UMA configurations:

    /home/test/linux-2.6/mm/slub.c: In function 'kmem_cache_init':
    /home/test/linux-2.6/mm/slub.c:3031:2: error: 'slab_memory_callback'
    undeclared (first use in this function)
    /home/test/linux-2.6/mm/slub.c:3031:2: note: each undeclared
    identifier is reported only once for each function it appears in
    make[2]: *** [mm/slub.o] Error 1
    make[1]: *** [mm] Error 2
    make: *** [sub-make] Error 2

    Reported-by: Zimny Lech
    Acked-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     

06 Oct, 2010

3 commits

  • There are a lot of #ifdef/#endifs that can be avoided if functions are
    placed differently. Move them around and reduce the number of #ifdefs.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Currently, disabling CONFIG_SLUB_DEBUG also disables sysfs support, meaning
    that the slabs cannot be tuned without DEBUG.

    Make sysfs support independent of CONFIG_SLUB_DEBUG.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • This patch optimizes slab_free() debug check to use "c->node != NUMA_NO_NODE"
    instead of "c->node >= 0" because the former generates smaller code on x86-64:

    Before:

    4736: 48 39 70 08 cmp %rsi,0x8(%rax)
    473a: 75 26 jne 4762
    473c: 44 8b 48 10 mov 0x10(%rax),%r9d
    4740: 45 85 c9 test %r9d,%r9d
    4743: 78 1d js 4762

    After:

    4736: 48 39 70 08 cmp %rsi,0x8(%rax)
    473a: 75 23 jne 475f
    473c: 83 78 10 ff cmpl $0xffffffffffffffff,0x10(%rax)
    4740: 74 1d je 475f

    This patch also cleans up __slab_alloc() to use NUMA_NO_NODE instead of
    "-1" when enabling debugging for a per-CPU cache. (A source-level sketch
    of the comparison change follows this entry.)

    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Signed-off-by: Pekka Enberg

    Pekka Enberg
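
    A hedged, source-level sketch of the kind of change that produces the
    codegen difference above (the struct is trimmed down to one field for
    illustration):

        #include <stdio.h>

        #define NUMA_NO_NODE (-1)

        struct cpu_cache { int node; };

        /* Before: a signed-range test needs a separate test/js pair. */
        static int debug_active_before(const struct cpu_cache *c)
        {
                return c->node >= 0;
        }

        /* After: comparing against the sentinel lets the compiler emit a
         * cmpl against -1 followed by a conditional jump. */
        static int debug_active_after(const struct cpu_cache *c)
        {
                return c->node != NUMA_NO_NODE;
        }

        int main(void)
        {
                struct cpu_cache c = { .node = NUMA_NO_NODE };

                printf("%d %d\n", debug_active_before(&c), debug_active_after(&c));
                return 0;
        }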
     

02 Oct, 2010

14 commits

  • Compile kmalloc_cache_alloc_node_notrace(), kmalloc_large_node()
    and __kmalloc_node_track_caller() only when CONFIG_NUMA is selected.

    Acked-by: David Rientjes
    Signed-off-by: Namhyung Kim
    Signed-off-by: Pekka Enberg

    Namhyung Kim
     
  • unfreeze_slab() releases the page's PG_locked bit but was missing the
    proper annotation. deactivate_slab() needs to be marked as well, since
    it calls unfreeze_slab() without grabbing the lock.

    Acked-by: David Rientjes
    Signed-off-by: Namhyung Kim
    Signed-off-by: Pekka Enberg

    Namhyung Kim
     
  • The bit-ops routines require their argument to be a pointer to unsigned
    long. This leads sparse to complain about different signedness, as follows:

    mm/slub.c:2425:49: warning: incorrect type in argument 2 (different signedness)
    mm/slub.c:2425:49: expected unsigned long volatile *addr
    mm/slub.c:2425:49: got long *map

    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Signed-off-by: Namhyung Kim
    Signed-off-by: Pekka Enberg

    Namhyung Kim
     
  • There are a couple of places that repeat the same statements when removing
    a page from the partial list. Consolidate them into __remove_partial().

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Pass the actual values used for inactive and active redzoning to the
    functions that check the objects. Avoids a lot of the ?: expressions used
    to look up the values in those functions.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Reduce the #ifdefs and simplify bootstrap by making SMP and NUMA as much alike
    as possible. This means that there will be an additional indirection to get to
    the kmem_cache_node field under SMP.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • This reverts commit 5249d039500f05a5ab379286b1d23ab9b04d3f2c. It's not needed
    after commit bbddff0545878a8649c091a9dd7c43ce91516734 ("percpu: use percpu
    allocator on UP too").

    Pekka Enberg
     
  • As explained by Linus "I'm Proud to be an American" Torvalds:

    Looking at the merging code, I actually think it's totally
    buggy. If you have something like this:

    - load module A: create slab cache A

    - load module B: create slab cache B that can merge with A

    - unload module A

    - "cat /proc/slabinfo": BOOM. Oops.

    exactly because the name is not handled correctly, and you'll have
    module B holding open a slab cache that has a name pointer that points
    to module A that no longer exists.

    This patch fixes the problem by using kstrdup() to allocate dynamic memory
    for the ->name of "struct kmem_cache", as suggested by Christoph Lameter.
    (A user-space sketch of the same idea follows this entry.)

    Acked-by: Christoph Lameter
    Cc: David Rientjes
    Reported-by: Linus Torvalds
    Signed-off-by: Pekka Enberg

    Conflicts:

    mm/slub.c

    Pekka Enberg
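
    A user-space sketch of the same idea, with strdup() standing in for
    kstrdup() and a hypothetical registry entry standing in for struct
    kmem_cache: duplicating the name makes the cache own its copy, so the
    string stays valid after the module that supplied it goes away.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        struct cache {
                const char *name;
        };

        /* Buggy variant: keeps the caller's pointer.  If the caller's memory
         * goes away (module unload), cache->name dangles. */
        static void register_cache_buggy(struct cache *c, const char *name)
        {
                c->name = name;
        }

        /* Fixed variant: duplicate the string so the cache owns its copy. */
        static int register_cache_fixed(struct cache *c, const char *name)
        {
                c->name = strdup(name);
                return c->name ? 0 : -1;
        }

        int main(void)
        {
                struct cache a, b;
                char *module_name = strdup("module_A_cache");

                register_cache_buggy(&a, module_name);
                register_cache_fixed(&b, module_name);

                free(module_name);          /* "module A" is unloaded */

                /* a.name now dangles; b.name is still valid. */
                printf("%s\n", b.name);
                free((void *)b.name);
                return 0;
        }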
     
  • Since the percpu allocator does not provide early allocation in UP mode (only
    in SMP configurations) use __get_free_page() to improvise a compound page
    allocation that can be later freed via kfree().

    Compound pages will be released when the cpu caches are resized.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Now that the kmalloc_caches array is dynamically allocated at boot,
    SLUB_RESILIENCY_TEST needs to be fixed to pass the correct type.

    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Pekka Enberg

    David Rientjes
     
  • Memory hotplug allocates and frees per node structures. Use the correct name.

    Acked-by: David Rientjes
    Acked-by: Randy Dunlap
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • On Wed, 25 Aug 2010, Randy Dunlap wrote:
    > mm/slub.c:1732: error: implicit declaration of function 'slab_pre_alloc_hook'
    > mm/slub.c:1751: error: implicit declaration of function 'slab_post_alloc_hook'
    > mm/slub.c:1881: error: implicit declaration of function 'slab_free_hook'
    > mm/slub.c:1886: error: implicit declaration of function 'slab_free_hook_irq'

    Empty functions are missing if the runtime debuggability option is compiled
    out.

    Provide fallback empty hook functions if SLUB_DEBUG is not set. (A small
    sketch of this pattern follows this entry.)

    Acked-by: Randy Dunlap
    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
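
    A small sketch of the pattern, using a made-up DEBUG_HOOKS switch instead
    of the real config option: when the feature is compiled out, empty static
    inline fallbacks keep every caller compiling and cost nothing at runtime.

        #include <stdio.h>
        #include <stdlib.h>

        /* Build with -DDEBUG_HOOKS to get the real hooks. */
        #ifdef DEBUG_HOOKS
        static void pre_alloc_hook(size_t size)
        {
                printf("about to allocate %zu bytes\n", size);
        }
        static void post_alloc_hook(void *obj)
        {
                printf("allocated %p\n", obj);
        }
        #else
        /* Fallbacks: empty hooks that the compiler optimizes away. */
        static inline void pre_alloc_hook(size_t size) { (void)size; }
        static inline void post_alloc_hook(void *obj)  { (void)obj; }
        #endif

        int main(void)
        {
                void *p;

                pre_alloc_hook(32);
                p = malloc(32);
                post_alloc_hook(p);
                free(p);
                return 0;
        }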
     
  • Move the gfpflags masking into the hooks for checkers and into the slowpaths.
    gfpflag masking requires access to a global variable and thus adds an
    additional cacheline reference to the hotpaths.

    If no hooks are active then the gfpflag masking will result in
    code that the compiler can toss out.

    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Extract the code that memory checkers and other verification tools use from
    the hotpaths. Makes it easier to add new ones and reduces the disturbances
    of the hotpaths.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter