11 Aug, 2010

1 commit

  • Now each architecture has its own dma_get_cache_alignment implementation.

    dma_get_cache_alignment returns the minimum DMA alignment. Architectures
    define it as ARCH_KMALLOC_MINALIGN (it's used to make sure that a kmalloc'ed
    buffer is DMA-safe; the buffer doesn't share a cache line with others). So
    we can unify the dma_get_cache_alignment implementations, as sketched at the
    end of this entry.

    This patch:

    dma_get_cache_alignment() needs to know whether an architecture defines
    ARCH_KMALLOC_MINALIGN or not (i.e. whether the architecture has a DMA
    alignment restriction). However, slab.h defines ARCH_KMALLOC_MINALIGN if an
    architecture doesn't define it.

    Let's rename ARCH_KMALLOC_MINALIGN to ARCH_DMA_MINALIGN.
    ARCH_KMALLOC_MINALIGN is used only in the internals of slab/slob/slub
    (except for crypto).

    Signed-off-by: FUJITA Tomonori
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
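    A minimal sketch of the unified helper this change enables (illustrative,
    not the exact patch; only dma_get_cache_alignment and ARCH_DMA_MINALIGN come
    from the description above):

    static inline int dma_get_cache_alignment(void)
    {
    #ifdef ARCH_DMA_MINALIGN
            /* Architecture has a DMA alignment restriction. */
            return ARCH_DMA_MINALIGN;
    #endif
            /* No restriction: a 1 byte alignment is DMA-safe. */
            return 1;
    }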
     

10 Jun, 2010

1 commit


09 Jun, 2010

1 commit

  • We have been resisting new ftrace plugins and removing existing
    ones, and kmemtrace has been superseded by kmem trace events
    and perf-kmem, so we remove it.

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    Acked-by: Eduard - Gabriel Munteanu
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    [ remove kmemtrace from the makefile, handle slob too ]
    Signed-off-by: Frederic Weisbecker

    Li Zefan
     

30 May, 2010

1 commit

  • Commit 756dee75872a2a764b478e18076360b8a4ec9045 ("SLUB: Get rid of dynamic DMA
    kmalloc cache allocation") makes S390 run out of kmalloc caches. Increase the
    number of kmalloc caches to a safe size.

    Cc: [ .33 and .34 ]
    Reported-by: Heiko Carstens
    Tested-by: Heiko Carstens
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

25 May, 2010

1 commit

  • This patch is meant to improve the performance of SLUB by moving the local
    kmem_cache_node lock into its own cacheline, separate from kmem_cache.
    This is accomplished by simply removing the local_node when NUMA is enabled
    (see the sketch at the end of this entry).

    On my system with 2 nodes I saw around a 5% performance increase with
    hackbench, times dropping from 6.2 seconds to 5.9 seconds on average. I
    suspect the performance gain would increase as the number of nodes
    increases, but I do not currently have the data to back that up.

    Bugzilla-Reference: http://bugzilla.kernel.org/show_bug.cgi?id=15713
    Cc:
    Reported-by: Alex Shi
    Tested-by: Alex Shi
    Acked-by: Yanmin Zhang
    Acked-by: Christoph Lameter
    Signed-off-by: Alexander Duyck
    Signed-off-by: Pekka Enberg

    Alexander Duyck
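    A sketch of the data-structure change described above (the node[] array and
    the get_node() helper follow mm/slub.c conventions, but treat the exact
    layout as illustrative):

    struct kmem_cache {
            /* ... hot per-cache fields ... */
    #ifdef CONFIG_NUMA
            /*
             * local_node removed: every node, including the local one,
             * is reached through the node array.
             */
            struct kmem_cache_node *node[MAX_NUMNODES];
    #else
            struct kmem_cache_node local_node;
    #endif
    };

    static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
    {
    #ifdef CONFIG_NUMA
            return s->node[node];
    #else
            return &s->local_node;
    #endif
    }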
     

20 May, 2010

1 commit


20 Dec, 2009

3 commits

  • Remove the fields in struct kmem_cache_cpu that were used to cache data from
    struct kmem_cache when they were in different cachelines. The cacheline that
    holds the per cpu array pointer now also holds these values. We can cut down
    the struct kmem_cache_cpu size to almost half.

    The get_freepointer() and set_freepointer() functions that used to be only
    intended for the slow path now are also useful for the hot path since access
    to the size field does not require accessing an additional cacheline anymore.
    This results in consistent use of functions for setting the freepointer of
    objects throughout SLUB.

    We also initialize all possible kmem_cache_cpu structures when a slab cache
    is created; there is no need to initialize them when a processor or node
    comes online.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
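    The two helpers referred to above look roughly like this (sketch; the free
    pointer is assumed to live at s->offset inside each object):

    static inline void *get_freepointer(struct kmem_cache *s, void *object)
    {
            return *(void **)(object + s->offset);
    }

    static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
    {
            *(void **)(object + s->offset) = fp;
    }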
     
  • Dynamic DMA kmalloc cache allocation is troublesome since the
    new percpu allocator does not support allocations in atomic contexts.
    Reserve some statically allocated kmalloc_cpu structures instead.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Using per cpu allocations removes the need for the per cpu arrays in the
    kmem_cache struct. These could get quite big if we have to support systems
    with thousands of cpus. The use of this_cpu_xx operations results in:

    1. The size of kmem_cache for SMP configuration shrinks since we will only
    need 1 pointer instead of NR_CPUS. The same pointer can be used by all
    processors. Reduces cache footprint of the allocator.

    2. We can dynamically size kmem_cache according to the actual nodes in the
    system meaning less memory overhead for configurations that may potentially
    support up to 1k NUMA nodes / 4k cpus.

    3. We can remove the fiddling with allocating and releasing kmem_cache_cpu
    structures when bringing up and shutting down cpus. The percpu alloc logic
    will do it all for us. This removes some portions of the cpu hotplug
    functionality.

    4. Fastpath performance increases since per cpu pointer lookups and
    address calculations are avoided.

    V7-V8
    - Convert missed get_cpu_slab() under CONFIG_SLUB_STATS

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
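    A sketch of what the conversion amounts to (illustrative; the cpu_slab field
    name and the fastpath snippet are assumptions about the shape of the code,
    not the exact patch):

    struct kmem_cache {
            /* One __percpu pointer replaces the old NR_CPUS-sized array. */
            struct kmem_cache_cpu __percpu *cpu_slab;
            /* ... */
    };

    /*
     * Fastpath access without per-cpu array indexing (the real code also
     * handles preemption and falls back to the slow path).
     */
    static void *fastpath_peek(struct kmem_cache *s)
    {
            struct kmem_cache_cpu *c = this_cpu_ptr(s->cpu_slab);

            return c->freelist;
    }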
     

11 Dec, 2009

1 commit

  • Define kmem_trace_alloc_{,node}_notrace() if CONFIG_TRACING is
    enabled; otherwise perf-kmem will show wrong stats when CONFIG_KMEM_TRACE
    is not set, because a kmalloc() memory allocation may be traced by both
    trace_kmalloc() and trace_kmem_cache_alloc().

    Signed-off-by: Li Zefan
    Reviewed-by: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: linux-mm@kvack.org
    Cc: Eduard - Gabriel Munteanu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     

15 Sep, 2009

1 commit


30 Aug, 2009

1 commit

  • If the minalign is 64 bytes, then the 96 byte cache should not be created
    because it would conflict with the 128 byte cache.

    If the minalign is 256 bytes, patching the size_index table should not
    result in a buffer overrun.

    The calculation "(i - 1) / 8" used to access size_index[] is moved to
    a separate function as suggested by Christoph Lameter.

    Acked-by: Christoph Lameter
    Signed-off-by: Aaro Koskinen
    Signed-off-by: Pekka Enberg

    Aaro Koskinen
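    The separate helper ends up looking roughly like this (sketch; the
    size_index_elem() name is illustrative, not quoted from the patch):

    static inline int size_index_elem(size_t bytes)
    {
            return (bytes - 1) / 8;
    }

    /* Both lookups and the boot-time patching of size_index[] then go
     * through the same mapping, e.g.
     *         index = size_index[size_index_elem(size)];
     */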
     

06 Aug, 2009

1 commit


08 Jul, 2009

1 commit


12 Jun, 2009

1 commit

  • As explained by Benjamin Herrenschmidt:

    Oh and btw, your patch alone doesn't fix powerpc, because it's missing
    a whole bunch of GFP_KERNEL's in the arch code... You would have to
    grep the entire kernel for things that check slab_is_available() and
    even then you'll be missing some.

    For example, slab_is_available() didn't always exist, and so in the
    early days on powerpc, we used a mem_init_done global that is set from
    mem_init() (not perfect but works in practice). And we still have code
    using that to do the test.

    Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
    in early boot code to avoid enabling interrupts.

    Signed-off-by: Pekka Enberg

    Pekka Enberg
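    One way to implement the masking described above (a sketch under
    assumptions; the slab_boot_gfp_mask name and helper are invented here, and
    kmem_cache_init_late() is used as the point where boot is far enough along):

    static gfp_t slab_boot_gfp_mask __read_mostly =
            __GFP_BITS_MASK & ~(__GFP_WAIT | __GFP_IO | __GFP_FS);

    static inline gfp_t boot_safe_gfp(gfp_t flags)
    {
            /*
             * Strip flags that could sleep, start IO, or enable interrupts
             * before the boot sequence is ready for that.
             */
            return flags & slab_boot_gfp_mask;
    }

    void __init kmem_cache_init_late(void)
    {
            /* Boot has progressed far enough: allow the full mask again. */
            slab_boot_gfp_mask = __GFP_BITS_MASK;
    }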
     

12 Apr, 2009

1 commit

  • Impact: refactor code for future changes

    Currently kmemtrace.h serves both as the header file of kmemtrace and as
    the definition of the kmem tracepoints.

    A tracepoint definition file may be used by other code, and should only
    contain the tracepoint definitions.

    We can separate include/trace/kmemtrace.h into 2 files:

    include/linux/kmemtrace.h: header file for kmemtrace
    include/trace/kmem.h: definition of kmem tracepoints

    Signed-off-by: Zhao Lei
    Acked-by: Eduard - Gabriel Munteanu
    Acked-by: Pekka Enberg
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Tom Zanussi
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Zhaolei
     

03 Apr, 2009

1 commit

  • kmemtrace now uses tracepoints instead of markers. We no longer need to
    use format specifiers to pass arguments.

    Signed-off-by: Eduard - Gabriel Munteanu
    [ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ]
    [ folded: fix build when CONFIG_KMEMTRACE is disabled. ]
    [ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ]
    Signed-off-by: Pekka Enberg
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Eduard - Gabriel Munteanu
     

02 Apr, 2009

1 commit


24 Mar, 2009

1 commit


23 Feb, 2009

1 commit


20 Feb, 2009

4 commits

  • …/slab-2.6 into tracing/kmemtrace

    Conflicts:
    mm/slub.c

    Ingo Molnar
     
  • As a preparatory patch to bump up the page allocator pass-through threshold,
    introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
    mm/slub.c to use them.

    Reported-by: "Zhang, Yanmin"
    Tested-by: "Zhang, Yanmin"
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Increase the maximum object size in SLUB so that 8k objects are not
    passed through to the page allocator anymore. The network stack uses 8k
    objects for performance critical operations.

    The patch is motivated by a SLAB vs. SLUB regression in the netperf
    benchmark. The problem is that the kfree(skb->head) call in
    skb_release_data() is subject to page allocator pass-through, as the
    size passed to __alloc_skb() is larger than 4 KB in this test.

    As explained by Yanmin Zhang:

    I use the 2.6.29-rc2 kernel to run netperf UDP-U-4k CPU_NUM client/server
    pair loopback testing on x86-64 machines. Compared with SLUB, SLAB's
    result is about 2.3 times SLUB's. After applying the reverting patch,
    the difference between SLUB and SLAB becomes 1%, which we might
    consider fluctuation.

    [ penberg@cs.helsinki.fi: fix oops in kmalloc() ]
    Reported-by: "Zhang, Yanmin"
    Tested-by: "Zhang, Yanmin"
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
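    With the constants from the preparatory patch in place, the bump itself is
    essentially the following (sketch; the concrete values correspond to the 8k
    threshold described above and are not quoted from the patch):

    /*
     * Maximum kmalloc object size handled by SLUB itself. Larger requests
     * are passed through to the page allocator.
     */
    #define SLUB_MAX_SIZE   (2 * PAGE_SIZE)

    /* Number of kmalloc caches needed to cover sizes up to SLUB_MAX_SIZE. */
    #define SLUB_PAGE_SHIFT (PAGE_SHIFT + 2)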
     
  • As a preparatory patch to bump up the page allocator pass-through threshold,
    introduce two new constants SLUB_MAX_SIZE and SLUB_PAGE_SHIFT and convert
    mm/slub.c to use them.

    Reported-by: "Zhang, Yanmin"
    Tested-by: "Zhang, Yanmin"
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

30 Dec, 2008

1 commit

  • Impact: new tracer plugin

    This patch adapts kmemtrace raw events tracing to the unified tracing API.

    To enable and use this tracer, just do the following:

    echo kmemtrace > /debugfs/tracing/current_tracer
    cat /debugfs/tracing/trace

    You will have the following output:

    # tracer: kmemtrace
    #
    #
    # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER
    # FREE | | | | | | | |
    # |

    type_id 1 call_site 18446744071565527833 ptr 18446612134395152256
    type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
    type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
    type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
    type_id 0 call_site 18446744071565636711 ptr 18446612134345164672 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
    type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
    type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
    type_id 0 call_site 18446744071565636711 ptr 18446612134345164912 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
    type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
    type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
    type_id 0 call_site 18446744071565636711 ptr 18446612134345165152 bytes_req 240 bytes_alloc 240 gfp_flags 208 node -1
    type_id 0 call_site 18446744071566144042 ptr 18446612134346191680 bytes_req 1304 bytes_alloc 1312 gfp_flags 208 node -1
    type_id 1 call_site 18446744071565585534 ptr 18446612134405955584
    type_id 0 call_site 18446744071565585597 ptr 18446612134405955584 bytes_req 4096 bytes_alloc 4096 gfp_flags 208 node -1
    type_id 1 call_site 18446744071565585534 ptr 18446612134405955584

    That was to stay backward compatible with the output format produced in
    linux/tracepoint.h.

    This is the default output, but note that I tried something else.

    If you change an option:

    echo kmem_minimalistic > /debugfs/trace_options

    and then cat /debugfs/trace, you will have the following output:

    # tracer: kmemtrace
    #
    #
    # ALLOC TYPE REQ GIVEN FLAGS POINTER NODE CALLER
    # FREE | | | | | | | |
    # |

    - C 0xffff88007c088780 file_free_rcu
    + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname
    - C 0xffff88007cad6000 putname
    + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname
    + K 240 240 000000d0 0xffff8800790dc780 -1 d_alloc
    - C 0xffff88007cad6000 putname
    + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname
    + K 240 240 000000d0 0xffff8800790dc870 -1 d_alloc
    - C 0xffff88007cad6000 putname
    + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname
    + K 240 240 000000d0 0xffff8800790dc960 -1 d_alloc
    + K 1304 1312 000000d0 0xffff8800791d7340 -1 reiserfs_alloc_inode
    - C 0xffff88007cad6000 putname
    + K 4096 4096 000000d0 0xffff88007cad6000 -1 getname
    - C 0xffff88007cad6000 putname
    + K 992 1000 000000d0 0xffff880079045b58 -1 alloc_inode
    + K 768 1024 000080d0 0xffff88007c096400 -1 alloc_pipe_info
    + K 240 240 000000d0 0xffff8800790dca50 -1 d_alloc
    + K 272 320 000080d0 0xffff88007c088780 -1 get_empty_filp
    + K 272 320 000080d0 0xffff88007c088000 -1 get_empty_filp

    I confess that kmem_minimalistic should probably be called kmem_alternative.

    Whatever the name, I find it more readable, but that is a personal opinion
    of course. We can drop it if you want.

    On the ALLOC/FREE column, + means an allocation and - a free.

    On the type column, you have K = kmalloc, C = cache, P = page

    I would like the flags to be GFP_* strings, but it would be hard to do that
    without breaking the column layout.

    About the node: it seems to always be -1. I don't know why, but that
    shouldn't be difficult to find out.

    I moved linux/tracepoint.h to trace/tracepoint.h as well. I think it is
    easier to find the tracer headers if they are all in their common
    directory.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

29 Dec, 2008

1 commit


05 Aug, 2008

1 commit

  • This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial
    value that is calculated from object size. The bigger the object size, the more
    pages we keep on the partial list.

    I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example
    script of the fio benchmarking tool. The script stresses the networking
    subsystem which should also give a fairly good beating of kmalloc() et al.

    To run the test yourself, first clone the fio repository:

    git clone git://git.kernel.dk/fio.git

    and then run the following command n times on your machine:

    time ./fio examples/netio

    The results on my 2-way 64-bit x86 machine are as follows:

    [ the minimum, maximum, and average are captured from 50 individual runs ]

    real time (seconds)
                     min     max     avg     sd
    SLAB             22.76   23.38   22.98   0.17
    SLUB             22.80   25.78   23.46   0.72
    SLUB (dynamic)   22.74   23.54   23.00   0.20

    sys time (seconds)
                     min     max     avg     sd
    SLAB              6.90    8.28    7.70   0.28
    SLUB              7.42   16.95    8.89   2.28
    SLUB (dynamic)    7.17    8.64    7.73   0.29

    user time (seconds)
                     min     max     avg     sd
    SLAB             36.89   38.11   37.50   0.29
    SLUB             30.85   37.99   37.06   1.67
    SLUB (dynamic)   36.75   38.07   37.59   0.32

    As you can see from the above numbers, this patch brings SLUB to the same
    level as SLAB for this particular workload, fixing a ~2% regression. I'd
    expect this change to help similar workloads that allocate a lot of objects
    that are close to the size of a page (a sketch of the calculation follows
    at the end of this entry).

    Cc: Matthew Wilcox
    Cc: Andrew Morton
    Acked-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
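    One plausible shape for the per-cache calculation (illustrative; the
    set_min_partial() helper, the MAX_PARTIAL bound, and the ilog2-based
    heuristic are assumptions about the approach, not the exact patch):

    #define MIN_PARTIAL 5           /* lower bound on partial slabs per node */
    #define MAX_PARTIAL 10          /* upper bound */

    static void set_min_partial(struct kmem_cache *s, unsigned long min)
    {
            if (min < MIN_PARTIAL)
                    min = MIN_PARTIAL;
            else if (min > MAX_PARTIAL)
                    min = MAX_PARTIAL;
            s->min_partial = min;
    }

    /* Bigger objects get more partial slabs, e.g. at cache creation time:
     *         set_min_partial(s, ilog2(s->size));
     */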
     

27 Jul, 2008

1 commit

  • The kmem cache passed to the constructor is only needed for constructors
    that are themselves multiplexers. Nobody uses this "feature", nor does
    anybody use the passed kmem cache in a non-trivial way, so pass only a
    pointer to the object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
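    The flag-day change to the constructor prototype, sketched (the example_ctor
    names are hypothetical):

    /* Before: every constructor received the kmem_cache it belonged to. */
    void example_ctor_old(struct kmem_cache *cachep, void *object);

    /* After: only the object pointer is passed. */
    void example_ctor_new(void *object);

    /* ... and kmem_cache_create() correspondingly becomes roughly: */
    struct kmem_cache *kmem_cache_create(const char *name, size_t size,
                                         size_t align, unsigned long flags,
                                         void (*ctor)(void *));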
     

05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

04 Jul, 2008

1 commit

  • The 192 byte cache is not necessary if we have a basic alignment of 128
    bytes. If it were used, then the 192 bytes would be aligned to the next 128
    byte boundary, which would result in another 256 byte cache. Two 256 byte
    kmalloc caches cause sysfs to complain about a duplicate entry.

    MIPS needs 128 byte aligned kmalloc caches and spits out warnings on boot
    without this patch.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

27 Apr, 2008

3 commits

  • If any higher order allocation fails, then fall back to the smallest order
    necessary to contain at least one object. This enables fallback to order 0
    pages for all allocations. The fallback will waste more memory (objects
    will not fit neatly) and the fallback slabs will not be as efficient as
    larger slabs since they contain fewer objects.

    Note that SLAB also depends on order 1 allocations for some slabs that
    waste too much memory if forced into a PAGE_SIZE'd page. SLUB can now deal
    with failing order 1 allocations, which SLAB cannot do.

    Add a new field, min, that contains the order and object count for the
    smallest possible order of a slab cache (the fallback path is sketched at
    the end of this entry).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
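    A sketch of the fallback path described above (illustrative; alloc_slab_page()
    and the exact gfp tweaks are assumptions, s->min is the new field described
    above, and kmem_cache_order_objects is the packed order/object-count word
    sketched under the last commit in this group):

    static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
    {
            struct kmem_cache_order_objects oo = s->oo;
            struct page *page;

            /* Try the preferred (possibly higher) order first, quietly. */
            page = alloc_slab_page(flags | __GFP_NOWARN | __GFP_NORETRY, node, oo);
            if (unlikely(!page)) {
                    /*
                     * The higher order allocation failed (fragmentation?).
                     * Fall back to the smallest order that still holds at
                     * least one object.
                     */
                    oo = s->min;
                    page = alloc_slab_page(flags, node, oo);
                    if (!page)
                            return NULL;
            }
            /* ... record oo_objects(oo) in the page and initialize it ... */
            return page;
    }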
     
  • Change the statistics to consider that slabs of the same slabcache
    can have different numbers of objects in them since they may be of
    different orders.

    Provide a new sysfs field

    total_objects

    which shows the total objects that the allocated slabs of a slabcache
    could hold.

    Add a max field that holds the largest slab order that was ever used
    for a slab cache.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Pack the order and the number of objects into a single word.
    This saves some memory in the kmem_cache structure and, more importantly,
    allows us to fetch both values atomically (see the sketch at the end of
    this entry).

    Later the slab orders become runtime configurable and we need to fetch these
    two items together in order to properly allocate a slab and initialize its
    objects.

    Fix the race by fetching the order and the number of objects in one word.

    [penberg@cs.helsinki.fi: fix memset() page order in new_slab()]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
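    The packing described above boils down to something like this (sketch; the
    struct name, OO_SHIFT, and the 16-bit split are illustrative):

    #define OO_SHIFT        16
    #define OO_MASK         ((1 << OO_SHIFT) - 1)

    struct kmem_cache_order_objects {
            unsigned long x;        /* order in the high bits, object count low */
    };

    static inline struct kmem_cache_order_objects oo_make(int order,
                                                           unsigned long size)
    {
            struct kmem_cache_order_objects x = {
                    (order << OO_SHIFT) + (PAGE_SIZE << order) / size
            };
            return x;
    }

    static inline int oo_order(struct kmem_cache_order_objects x)
    {
            return x.x >> OO_SHIFT;
    }

    static inline int oo_objects(struct kmem_cache_order_objects x)
    {
            return x.x & OO_MASK;
    }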
     

14 Apr, 2008

1 commit

  • The per node counters are used mainly for showing data through the sysfs API.
    If that API is not compiled in, then there is no point in keeping track of
    this data. Disable the counters for the number of slabs and the number of
    total slabs if !SLUB_DEBUG. Incrementing the per node counters also accesses
    a potentially contended cacheline, so this could actually be a performance
    benefit for embedded systems.

    SLABINFO support is also affected. It now must depend on SLUB_DEBUG (which
    is on by default).

    The patch also avoids a check for a NULL kmem_cache_node pointer in
    new_slab() if the system is not compiled with NUMA support.

    [penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
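    A sketch of how the counters become debug-only (illustrative; field and
    helper names are assumptions, not quoted from the patch):

    struct kmem_cache_node {
            spinlock_t list_lock;
            unsigned long nr_partial;
            struct list_head partial;
    #ifdef CONFIG_SLUB_DEBUG
            atomic_long_t nr_slabs;         /* only needed for sysfs reporting */
            struct list_head full;
    #endif
    };

    #ifdef CONFIG_SLUB_DEBUG
    static inline void inc_slabs_node(struct kmem_cache *s, int node)
    {
            atomic_long_inc(&get_node(s, node)->nr_slabs);
    }
    #else
    static inline void inc_slabs_node(struct kmem_cache *s, int node) {}
    #endif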
     

04 Mar, 2008

1 commit


15 Feb, 2008

3 commits

  • Currently we hand off PAGE_SIZEd kmallocs to the page allocator in the
    mistaken belief that the page allocator can handle these allocations
    effectively. However, measurements indicate a minimum slowdown by a
    factor of 8 (and that is only SMP; NUMA is much worse) vs. the slub
    fastpath, which causes regressions in tbench.

    Increase the number of kmalloc caches by one so that we again handle 4k
    kmallocs directly from slub. 4k page buffering for the page allocator
    will be performed by slub as is done by slab.

    At some point the page allocator fastpath should be fixed. A lot of the kernel
    would benefit from a faster ability to allocate a single page. If that is
    done then the 4k allocs may again be forwarded to the page allocator and this
    patch could be reverted.

    Reviewed-by: Pekka Enberg
    Acked-by: Mel Gorman
    Signed-off-by: Christoph Lameter

    Christoph Lameter
     
  • Currently we determine the gfp flags to pass to the page allocator
    each time a slab is allocated.

    Determine the bits to be set at the time the slab cache is created. Store
    them in a new allocflags field and add the flags in allocate_slab() (as
    sketched at the end of this entry).

    Acked-by: Mel Gorman
    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
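    A sketch of precomputing the flags at cache creation time (illustrative; the
    helper name and the particular flag selection are assumptions based on
    typical SLUB behaviour; only the allocflags field and allocate_slab() come
    from the description above):

    /* Called once, when the cache and its slab order are set up. */
    static void precompute_allocflags(struct kmem_cache *s, int order)
    {
            s->allocflags = 0;
            if (order)
                    s->allocflags |= __GFP_COMP;    /* higher order: compound page */
            if (s->flags & SLAB_CACHE_DMA)
                    s->allocflags |= GFP_DMA;
            if (s->flags & SLAB_RECLAIM_ACCOUNT)
                    s->allocflags |= __GFP_RECLAIMABLE;
    }

    /* allocate_slab() then only has to OR the precomputed bits in:
     *         page = alloc_pages(flags | s->allocflags, order);
     */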
     
  • This adds a proper function for kmalloc page allocator pass-through. While
    it greatly simplifies any code that does slab tracing, I think it's a
    worthwhile cleanup in itself (see the sketch at the end of this entry).

    Signed-off-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Pekka Enberg
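    The pass-through helper is essentially the following (sketch; the
    kmalloc_large() name reflects slub_def.h of that era, but treat the exact
    form as illustrative):

    static __always_inline void *kmalloc_large(size_t size, gfp_t flags)
    {
            /* Hand the request straight to the page allocator as a compound page. */
            return (void *)__get_free_pages(flags | __GFP_COMP, get_order(size));
    }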
     

08 Feb, 2008

1 commit

  • The statistics provided here allow the monitoring of allocator behavior but
    at the cost of some (minimal) loss of performance. Counters are placed in
    SLUB's per cpu data structure. The per cpu structure may be extended by the
    statistics to grow larger than one cacheline, which will increase the cache
    footprint of SLUB.

    There is a compile option to enable/disable the inclusion of the runtime
    statistics, and it's off by default.

    The slabinfo tool is enhanced to support these statistics via two options:

    -D Switches the line of information displayed for a slab from size
    mode to activity mode.

    -A Sorts the slabs displayed by activity. This allows the display of
    the slabs most important to the performance of a certain load.

    -r Report option will report detailed statistics on

    Example (tbench load):

    slabinfo -AD ->Shows the most active slabs

    Name Objects Alloc Free %Fast
    skbuff_fclone_cache 33 111953835 111953835 99 99
    :0000192 2666 5283688 5281047 99 99
    :0001024 849 5247230 5246389 83 83
    vm_area_struct 1349 119642 118355 91 22
    :0004096 15 66753 66751 98 98
    :0000064 2067 25297 23383 98 78
    dentry 10259 28635 18464 91 45
    :0000080 11004 18950 8089 98 98
    :0000096 1703 12358 10784 99 98
    :0000128 762 10582 9875 94 18
    :0000512 184 9807 9647 95 81
    :0002048 479 9669 9195 83 65
    anon_vma 777 9461 9002 99 71
    kmalloc-8 6492 9981 5624 99 97
    :0000768 258 7174 6931 58 15

    So the skbuff_fclone_cache is of highest importance for the tbench load.
    Pretty high load on the 192 sized slab. Look for the aliases

    slabinfo -a | grep 000192
    :0000192 -r option implied if cache name is mentioned

    .... Usual output ...

    Slab Perf Counter Alloc Free %Al %Fr
    --------------------------------------------------
    Fastpath 111953360 111946981 99 99
    Slowpath 1044 7423 0 0
    Page Alloc 272 264 0 0
    Add partial 25 325 0 0
    Remove partial 86 264 0 0
    RemoteObj/SlabFrozen 350 4832 0 0
    Total 111954404 111954404

    Flushes 49 Refill 0
    Deactivate Full=325(92%) Empty=0(0%) ToHead=24(6%) ToTail=1(0%)

    Looks good because the fastpath is overwhelmingly taken.

    skbuff_head_cache:

    Slab Perf Counter Alloc Free %Al %Fr
    --------------------------------------------------
    Fastpath 5297262 5259882 99 99
    Slowpath 4477 39586 0 0
    Page Alloc 937 824 0 0
    Add partial 0 2515 0 0
    Remove partial 1691 824 0 0
    RemoteObj/SlabFrozen 2621 9684 0 0
    Total 5301739 5299468

    Deactivate Full=2620(100%) Empty=0(0%) ToHead=0(0%) ToTail=0(0%)

    Descriptions of the output:

    Total: The total number of allocations and frees that occurred for a
    slab

    Fastpath: The number of allocations/frees that used the fastpath.

    Slowpath: Other allocations

    Page Alloc: Number of calls to the page allocator as a result of slowpath
    processing

    Add Partial: Number of slabs added to the partial list through free or
    alloc (occurs during cpuslab flushes)

    Remove Partial: Number of slabs removed from the partial list as a result of
    allocations retrieving a partial slab or by a free freeing
    the last object of a slab.

    RemoteObj/Froz: How many times a remotely freed object was encountered when
    a slab was about to be deactivated. Frozen: how many times free
    was able to skip list processing because the slab was in use
    as the cpuslab of another processor.

    Flushes: Number of times the cpuslab was flushed on request
    (kmem_cache_shrink, may result from races in __slab_alloc)

    Refill: Number of times we were able to refill the cpuslab from
    remotely freed objects for the same slab.

    Deactivate: Statistics on how slabs were deactivated. Shows how they were
    put onto the partial list.

    In general, the fastpath is very good. Slowpath without partial list
    processing is also desirable. Any touching of the partial list uses node
    specific locks, which may potentially cause list lock contention.

    Signed-off-by: Christoph Lameter

    Christoph Lameter
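    The counters compile away unless CONFIG_SLUB_STATS is set; the increment
    helper is roughly the following (sketch; the stat item names are
    illustrative):

    static inline void stat(struct kmem_cache_cpu *c, enum stat_item si)
    {
    #ifdef CONFIG_SLUB_STATS
            c->stat[si]++;          /* per cpu, so no atomics needed */
    #endif
    }

    /* Used at the interesting points in the allocator, e.g.
     *         stat(c, ALLOC_FASTPATH);
     *         stat(c, FREE_SLOWPATH);
     */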
     

05 Feb, 2008

1 commit