29 Jul, 2016

2 commits

  • For KASAN builds:
    - switch the SLUB allocator to using stackdepot instead of storing the
    allocation/deallocation stacks in the objects;
    - change the freelist hook so that parts of the freelist can be put
    into the quarantine (a sketch of the stackdepot side follows this entry).

    [aryabinin@virtuozzo.com: fixes]
    Link: http://lkml.kernel.org/r/1468601423-28676-1-git-send-email-aryabinin@virtuozzo.com
    Link: http://lkml.kernel.org/r/1468347165-41906-3-git-send-email-glider@google.com
    Signed-off-by: Alexander Potapenko
    Cc: Andrey Konovalov
    Cc: Christoph Lameter
    Cc: Dmitry Vyukov
    Cc: Steven Rostedt (Red Hat)
    Cc: Joonsoo Kim
    Cc: Kostya Serebryany
    Cc: Andrey Ryabinin
    Cc: Kuthonuzo Luruo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
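
    A minimal sketch of the stackdepot side described above, assuming the
    <linux/stackdepot.h> API of that era (depot_save_stack() and
    depot_fetch_stack()); the real mm/kasan code differs in detail. The
    point is that each object's alloc/free track shrinks to a 32-bit
    handle, and the full trace is fetched from the depot only when a
    report is printed.

        #include <linux/gfp.h>
        #include <linux/kernel.h>
        #include <linux/stackdepot.h>
        #include <linux/stacktrace.h>

        /* Per-object track: a compact handle instead of an array of PCs. */
        struct track_sketch {
                depot_stack_handle_t handle;
        };

        static void track_save(struct track_sketch *t, gfp_t flags)
        {
                unsigned long entries[16];
                struct stack_trace trace = {
                        .entries     = entries,
                        .max_entries = ARRAY_SIZE(entries),
                        .skip        = 2,       /* drop the tracking helpers */
                };

                save_stack_trace(&trace);
                t->handle = depot_save_stack(&trace, flags);
        }

        static void track_print(const struct track_sketch *t)
        {
                struct stack_trace trace;

                depot_fetch_stack(t->handle, &trace);   /* points into depot storage */
                print_stack_trace(&trace, 0);
        }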
     
  • When looking up the nearest SLUB object for a given address, correctly
    calculate its offset if SLAB_RED_ZONE is enabled for that cache.

    Previously, when KASAN detected an error on an object from a cache
    with SLAB_RED_ZONE set, the actual start address of the object was
    miscalculated, which led to random stacks being reported (a sketch of
    the corrected offset calculation follows this entry).

    Fixes: 7ed2f9e663854db ("mm, kasan: SLAB support")
    Link: http://lkml.kernel.org/r/1468347165-41906-2-git-send-email-glider@google.com
    Signed-off-by: Alexander Potapenko
    Cc: Andrey Konovalov
    Cc: Christoph Lameter
    Cc: Dmitry Vyukov
    Cc: Steven Rostedt (Red Hat)
    Cc: Joonsoo Kim
    Cc: Kostya Serebryany
    Cc: Andrey Ryabinin
    Cc: Kuthonuzo Luruo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
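
    A minimal sketch of the corrected offset calculation, using the SLUB
    fields of that era (s->size, s->red_left_pad, SLAB_RED_ZONE); it is not
    the exact mm/slub.c code, only the arithmetic the entry above describes.

        static void *nearest_obj_sketch(struct kmem_cache *s, struct page *page,
                                        void *x)
        {
                void *base = page_address(page);
                void *object = base + ((x - base) / s->size) * s->size;

                /*
                 * With SLAB_RED_ZONE each object is preceded by a left red
                 * zone of s->red_left_pad bytes; skip it to get the real
                 * object start, otherwise KASAN resolves the wrong metadata
                 * and reports random stacks.
                 */
                if (s->flags & SLAB_RED_ZONE)
                        object += s->red_left_pad;

                return object;
        }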
     

27 Jul, 2016

1 commit

  • Implements freelist randomization for the SLUB allocator. It was
    previously implemented for the SLAB allocator. Both use the same
    configuration option (CONFIG_SLAB_FREELIST_RANDOM).

    The list is randomized during initialization of a new set of pages. The
    shuffled order for the different freelist sizes is pre-computed at boot
    for performance. Each kmem_cache has its own randomized freelist (a
    sketch of the shuffling step follows this entry).

    This security feature reduces the predictability of the kernel SLUB
    allocator against heap overflows, rendering such attacks much less stable.

    For example these attacks exploit the predictability of the heap:
    - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU)
    - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95)

    Performance results:

    The slab_test impact is between 3% and 4% on average for 100000 attempts
    without SMP. slab_test is a very focused benchmark; kernbench shows that
    the overall impact on the system is much lower.

    Before:

    Single thread testing
    =====================
    1. Kmalloc: Repeatedly allocate then free test
    100000 times kmalloc(8) -> 49 cycles kfree -> 77 cycles
    100000 times kmalloc(16) -> 51 cycles kfree -> 79 cycles
    100000 times kmalloc(32) -> 53 cycles kfree -> 83 cycles
    100000 times kmalloc(64) -> 62 cycles kfree -> 90 cycles
    100000 times kmalloc(128) -> 81 cycles kfree -> 97 cycles
    100000 times kmalloc(256) -> 98 cycles kfree -> 121 cycles
    100000 times kmalloc(512) -> 95 cycles kfree -> 122 cycles
    100000 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles
    100000 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles
    100000 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles
    2. Kmalloc: alloc/free test
    100000 times kmalloc(8)/kfree -> 70 cycles
    100000 times kmalloc(16)/kfree -> 70 cycles
    100000 times kmalloc(32)/kfree -> 70 cycles
    100000 times kmalloc(64)/kfree -> 70 cycles
    100000 times kmalloc(128)/kfree -> 70 cycles
    100000 times kmalloc(256)/kfree -> 69 cycles
    100000 times kmalloc(512)/kfree -> 70 cycles
    100000 times kmalloc(1024)/kfree -> 73 cycles
    100000 times kmalloc(2048)/kfree -> 72 cycles
    100000 times kmalloc(4096)/kfree -> 71 cycles

    After:

    Single thread testing
    =====================
    1. Kmalloc: Repeatedly allocate then free test
    100000 times kmalloc(8) -> 57 cycles kfree -> 78 cycles
    100000 times kmalloc(16) -> 61 cycles kfree -> 81 cycles
    100000 times kmalloc(32) -> 76 cycles kfree -> 93 cycles
    100000 times kmalloc(64) -> 83 cycles kfree -> 94 cycles
    100000 times kmalloc(128) -> 106 cycles kfree -> 107 cycles
    100000 times kmalloc(256) -> 118 cycles kfree -> 117 cycles
    100000 times kmalloc(512) -> 114 cycles kfree -> 116 cycles
    100000 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles
    100000 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles
    100000 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles
    2. Kmalloc: alloc/free test
    100000 times kmalloc(8)/kfree -> 66 cycles
    100000 times kmalloc(16)/kfree -> 66 cycles
    100000 times kmalloc(32)/kfree -> 66 cycles
    100000 times kmalloc(64)/kfree -> 66 cycles
    100000 times kmalloc(128)/kfree -> 65 cycles
    100000 times kmalloc(256)/kfree -> 67 cycles
    100000 times kmalloc(512)/kfree -> 67 cycles
    100000 times kmalloc(1024)/kfree -> 64 cycles
    100000 times kmalloc(2048)/kfree -> 67 cycles
    100000 times kmalloc(4096)/kfree -> 67 cycles

    Kernbench, before:

    Average Optimal load -j 12 Run (std deviation):
    Elapsed Time 101.873 (1.16069)
    User Time 1045.22 (1.60447)
    System Time 88.969 (0.559195)
    Percent CPU 1112.9 (13.8279)
    Context Switches 189140 (2282.15)
    Sleeps 99008.6 (768.091)

    After:

    Average Optimal load -j 12 Run (std deviation):
    Elapsed Time 102.47 (0.562732)
    User Time 1045.3 (1.34263)
    System Time 88.311 (0.342554)
    Percent CPU 1105.8 (6.49444)
    Context Switches 189081 (2355.78)
    Sleeps 99231.5 (800.358)

    Link: http://lkml.kernel.org/r/1464295031-26375-3-git-send-email-thgarnie@google.com
    Signed-off-by: Thomas Garnier
    Reviewed-by: Kees Cook
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Garnier
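
    A minimal sketch of the shuffling step, assuming nothing beyond
    <linux/random.h>; the real code pre-computes one such sequence per
    kmem_cache at boot and reuses it when linking the freelist of every
    new slab.

        #include <linux/random.h>

        /* Fill list[] with 0..count-1, then Fisher-Yates shuffle it. */
        static void init_random_freelist_order(unsigned int *list, unsigned int count)
        {
                unsigned int i, pos, tmp;

                for (i = 0; i < count; i++)
                        list[i] = i;

                for (i = count - 1; i > 0; i--) {
                        pos = prandom_u32() % (i + 1);  /* illustrative RNG choice */
                        tmp = list[i];
                        list[i] = list[pos];
                        list[pos] = tmp;
                }
        }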
     

27 May, 2016

1 commit

  • It's unused since commit 7ed2f9e66385 ("mm, kasan: SLAB support")

    Link: http://lkml.kernel.org/r/1464020961-2242-1-git-send-email-aryabinin@virtuozzo.com
    Signed-off-by: Andrey Ryabinin
    Cc: Joonsoo Kim
    Cc: David Rientjes
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

26 Mar, 2016

1 commit

  • Add KASAN hooks to SLAB allocator.

    This patch is based on the "mm: kasan: unified support for SLUB and SLAB
    allocators" patch originally prepared by Dmitry Chernenkov.

    Signed-off-by: Alexander Potapenko
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Steven Rostedt
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

16 Mar, 2016

1 commit

  • SLUB already has a redzone debugging feature, but it is only positioned
    at the end of the object (aka the right redzone), so it cannot catch a
    left out-of-bounds (OOB) access. Although the current object's right
    redzone acts as the left redzone of the next object, the first object
    in a slab cannot take advantage of this effect. This patch explicitly
    adds a left red zone to each object to detect left OOB more precisely
    (a sketch of the resulting layout follows this entry).

    Background:

    Someone complained to me that a left OOB is not caught even when KASAN,
    which does page allocation debugging, is enabled. The page to the left
    is out of our control, so it may already be allocated when the left OOB
    happens, and in that case we cannot detect the OOB. Moreover, the SLUB
    debugging feature can be enabled without page allocator debugging, and
    in that case we will also miss the OOB.

    Before trying to implement it, I expected the changes to be too complex,
    but they do not look that complex to me now. Almost all changes are
    confined to debug-specific functions, so I feel okay.

    Signed-off-by: Joonsoo Kim
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
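
    A sketch of the resulting per-object layout (widths illustrative;
    red_left_pad is the field added for the new zone, the rest follows the
    existing SLUB debug layout):

        | left red zone  | object payload      | right red zone / debug metadata |
        |<-red_left_pad->|<--- object_size --->|
                         ^ pointer returned to callers

    A store just below the returned pointer now lands in the object's own
    left red zone rather than in the previous object (or outside the slab
    page, for the first object), so the debug checks can attribute and
    report the overflow precisely.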
     

21 Jan, 2016

1 commit


14 Feb, 2015

2 commits

  • Remove static and add function declarations to linux/slub_def.h so they
    can be used by the kernel address sanitizer.

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     
  • virt_to_obj() takes the kmem_cache address, the address of the slab
    page, and an address x pointing somewhere inside a slab object, and
    returns the address of the beginning of that object (a sketch follows
    this entry).

    Signed-off-by: Andrey Ryabinin
    Acked-by: Christoph Lameter
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
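
    A sketch consistent with the description above; the helper is
    essentially a one-liner (the exact prototype in include/linux/slub_def.h
    may differ slightly):

        static inline void *virt_to_obj(struct kmem_cache *s,
                                        const void *slab_page, const void *x)
        {
                /* Round x down to the start of the object it falls into. */
                return (void *)(x - ((x - slab_page) % s->size));
        }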
     

13 Feb, 2015

1 commit

  • Currently, kmem_cache stores a pointer to struct memcg_cache_params
    instead of embedding it. The rationale is to save memory when kmem
    accounting is disabled. However, the memcg_cache_params has shrivelled
    drastically since it was first introduced:

    * Initially:

    struct memcg_cache_params {
            bool is_root_cache;
            union {
                    struct kmem_cache *memcg_caches[0];
                    struct {
                            struct mem_cgroup *memcg;
                            struct list_head list;
                            struct kmem_cache *root_cache;
                            bool dead;
                            atomic_t nr_pages;
                            struct work_struct destroy;
                    };
            };
    };

    * Now:

    struct memcg_cache_params {
            bool is_root_cache;
            union {
                    struct {
                            struct rcu_head rcu_head;
                            struct kmem_cache *memcg_caches[0];
                    };
                    struct {
                            struct mem_cgroup *memcg;
                            struct kmem_cache *root_cache;
                    };
            };
    };

    So the memory saving does not seem to be a clear win anymore.

    OTOH, keeping a pointer to memcg_cache_params struct instead of embedding
    it results in touching one more cache line on kmem alloc/free hot paths.
    Besides, it makes linking kmem caches in a list chained by a field of
    struct memcg_cache_params really painful due to a level of indirection,
    while I want to link them that way in the following patch. So let us
    embed it.

    Signed-off-by: Vladimir Davydov
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Dave Chinner
    Cc: Dan Carpenter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

07 May, 2014

1 commit

  • debugobjects warning during netfilter exit:

    ------------[ cut here ]------------
    WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G W 3.11.0-next-20130906-sasha #3984
    Workqueue: netns cleanup_net
    Call Trace:
    dump_stack+0x52/0x87
    warn_slowpath_common+0x8c/0xc0
    warn_slowpath_fmt+0x46/0x50
    debug_print_object+0x8d/0xb0
    __debug_check_no_obj_freed+0xa5/0x220
    debug_check_no_obj_freed+0x15/0x20
    kmem_cache_free+0x197/0x340
    kmem_cache_destroy+0x86/0xe0
    nf_conntrack_cleanup_net_list+0x131/0x170
    nf_conntrack_pernet_exit+0x5d/0x70
    ops_exit_list+0x5e/0x70
    cleanup_net+0xfb/0x1c0
    process_one_work+0x338/0x550
    worker_thread+0x215/0x350
    kthread+0xe7/0xf0
    ret_from_fork+0x7c/0xb0

    Also during dcookie cleanup:

    WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
    Call Trace:
    dump_stack (lib/dump_stack.c:52)
    warn_slowpath_common (kernel/panic.c:430)
    warn_slowpath_fmt (kernel/panic.c:445)
    debug_print_object (lib/debugobjects.c:262)
    __debug_check_no_obj_freed (lib/debugobjects.c:697)
    debug_check_no_obj_freed (lib/debugobjects.c:726)
    kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
    kmem_cache_destroy (mm/slab_common.c:363)
    dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
    event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
    __fput (fs/file_table.c:217)
    ____fput (fs/file_table.c:253)
    task_work_run (kernel/task_work.c:125 (discriminator 1))
    do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
    int_signal (arch/x86/kernel/entry_64.S:807)

    Sysfs has a release mechanism. Use that to release the kmem_cache
    structure if CONFIG_SYSFS is enabled (a sketch of the pattern follows
    this entry).

    Only SLUB is changed - SLAB currently only supports /proc/slabinfo and
    not /sys/kernel/slab/*. We talked about adding that, and someone was
    working on it.

    [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
    [akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
    Signed-off-by: Christoph Lameter
    Reported-by: Sasha Levin
    Tested-by: Sasha Levin
    Acked-by: Greg KH
    Cc: Thomas Gleixner
    Cc: Pekka Enberg
    Cc: Russell King
    Cc: Bart Van Assche
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
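
    A sketch of the pattern, using the standard kobject release hook; the
    helper names (to_slab(), slab_sysfs_ops, slab_kmem_cache_release())
    follow mm/slub.c of that era and should be treated as illustrative.

        static void kmem_cache_release(struct kobject *k)
        {
                /* The final free runs only after the last sysfs reference is dropped. */
                slab_kmem_cache_release(to_slab(k));
        }

        static struct kobj_type slab_ktype = {
                .sysfs_ops = &slab_sysfs_ops,
                .release   = kmem_cache_release,
        };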
     

08 Apr, 2014

1 commit

  • Currently, we try to arrange sysfs entries for memcg caches in the same
    manner as for global caches. Apart from turning /sys/kernel/slab into a
    mess when there are a lot of kmem-active memcgs created, it actually
    does not work properly - we won't create more than one link to a memcg
    cache in case its parent is merged with another cache. For instance, if
    A is a root cache merged with another root cache B, we will have the
    following sysfs setup:

    X
    A -> X
    B -> X

    where X is some unique id (see create_unique_id()). Now if memcgs M and
    N start to allocate from cache A (or B, which is the same), we will get:

    X
    X:M
    X:N
    A -> X
    B -> X
    A:M -> X:M
    A:N -> X:N

    Since B is an alias for A, we won't get entries B:M and B:N, which is
    confusing.

    It is more logical to have entries for memcg caches under the
    corresponding root cache's sysfs directory. This keeps the sysfs layout
    clean and avoids inconsistencies like the one described above.

    This patch does the trick. It creates a "cgroup" kset in each root
    cache's kobject to keep its children caches there (a sketch follows
    this entry).

    Signed-off-by: Vladimir Davydov
    Cc: Michal Hocko
    Cc: Johannes Weiner
    Cc: David Rientjes
    Cc: Pekka Enberg
    Cc: Glauber Costa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
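
    A sketch of the sysfs side, assuming the standard kset API
    (kset_create_and_add()) and a memcg_kset pointer in the root cache, as
    the entry above implies; illustrative only.

        /*
         * Called once per root cache: /sys/kernel/slab/<root>/cgroup/ will
         * collect the kobjects of that cache's memcg copies.
         */
        static int create_memcg_kset(struct kmem_cache *root)
        {
                root->memcg_kset = kset_create_and_add("cgroup", NULL, &root->kobj);
                return root->memcg_kset ? 0 : -ENOMEM;
        }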
     

23 Nov, 2013

1 commit

  • Pull SLAB changes from Pekka Enberg:
    "The patches from Joonsoo Kim switch mm/slab.c to use 'struct page' for
    slab internals similar to mm/slub.c. This reduces memory usage and
    improves performance:

    https://lkml.org/lkml/2013/10/16/155

    Rest of the changes are bug fixes from various people"

    * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (21 commits)
    mm, slub: fix the typo in mm/slub.c
    mm, slub: fix the typo in include/linux/slub_def.h
    slub: Handle NULL parameter in kmem_cache_flags
    slab: replace non-existing 'struct freelist *' with 'void *'
    slab: fix to calm down kmemleak warning
    slub: proper kmemleak tracking if CONFIG_SLUB_DEBUG disabled
    slab: rename slab_bufctl to slab_freelist
    slab: remove useless statement for checking pfmemalloc
    slab: use struct page for slab management
    slab: replace free and inuse in struct slab with newly introduced active
    slab: remove SLAB_LIMIT
    slab: remove kmem_bufctl_t
    slab: change the management method of free objects of the slab
    slab: use __GFP_COMP flag for allocating slab pages
    slab: use well-defined macro, virt_to_slab()
    slab: overloading the RCU head over the LRU for RCU free
    slab: remove cachep in struct slab_rcu
    slab: remove nodeid in struct slab
    slab: remove colouroff in struct slab
    slab: change return type of kmem_getpages() to struct page
    ...

    Linus Torvalds
     

12 Nov, 2013

1 commit


05 Sep, 2013

2 commits

  • I do not see any user for this code in the tree.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • The kmalloc* functions of all slab allocators are similar now, so
    let's move them into slab.h. This requires some function naming changes
    in SLOB.

    As a result of this patch there is a common set of functions for
    all allocators. It also means that kmalloc_large() is now available
    in general to perform large-order allocations that go directly
    via the page allocator. kmalloc_large() can be substituted when
    kmalloc() throws warnings because of too-large allocations.

    kmalloc_large() has exactly the same semantics as kmalloc() but
    can only be used for allocations > PAGE_SIZE (an illustrative use
    follows this entry).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
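
    An illustrative use of kmalloc_large(); the size here is a stand-in,
    and the buffer is still released with kfree().

        #include <linux/mm.h>
        #include <linux/slab.h>

        static void *alloc_big_buffer(void)
        {
                /* Known to be > PAGE_SIZE, so go straight to the page allocator. */
                return kmalloc_large(8 * PAGE_SIZE, GFP_KERNEL);
        }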
     

01 Feb, 2013

5 commits

  • Put the definitions for the kmem_cache_node structures together so that
    we have one structure (a sketch follows this entry). That will allow us
    to create more common fields in the future, which could yield more
    opportunities to share code.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
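
    A sketch of the combined definition (an illustrative subset; the real
    structure carries more per-allocator fields):

        struct kmem_cache_node {
                spinlock_t list_lock;           /* shared by both allocators */

        #ifdef CONFIG_SLAB
                struct list_head slabs_partial;
                struct list_head slabs_full;
                struct list_head slabs_free;
                unsigned long free_objects;
        #endif

        #ifdef CONFIG_SLUB
                unsigned long nr_partial;
                struct list_head partial;
        #endif
        };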
     
  • Extract the optimized lookup functions from slub and put them into
    slab_common.c. Then make slab use these functions as well.

    Joonsoo notes that this fixes some issues with constant folding which
    also reduces the code size for slub.

    https://lkml.org/lkml/2012/10/20/82

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Have a common definition of the kmalloc cache arrays in
    SLAB and SLUB.

    Acked-by: Glauber Costa
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Standardize the constants that describe the smallest and largest
    object kept in the kmalloc arrays for SLAB and SLUB.

    Differentiate between the maximum size for which a slab cache is used
    (KMALLOC_MAX_CACHE_SIZE) and the maximum allocatable size
    (KMALLOC_MAX_SIZE, KMALLOC_MAX_ORDER).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Extract the function to determine the index of the slab within
    the array of kmalloc caches, as well as a function to determine the
    maximum object size from the index of the kmalloc slab (an illustrative
    round trip follows this entry).

    This is used here only to simplify SLUB bootstrap but will
    be used later also for SLAB.

    Acked-by: Glauber Costa
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
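
    An illustrative round trip through the two helpers, assuming the usual
    names kmalloc_index() and kmalloc_size(); with a constant size the
    compiler folds the index away.

        #include <linux/slab.h>

        static unsigned int kmalloc_cache_limit_example(void)
        {
                int idx = kmalloc_index(100);   /* 100 bytes -> the 128-byte cache */

                return kmalloc_size(idx);       /* largest object it serves: 128 */
        }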
     

19 Dec, 2012

3 commits

  • SLUB allows us to tune a particular cache behavior with sysfs-based
    tunables. When creating a new memcg cache copy, we'd like to preserve any
    tunables the parent cache already had.

    This can be done by tapping into the store attribute function provided by
    the allocator. We of course don't need to mess with read-only fields.
    Since the attributes can have multiple types and are stored internally by
    sysfs, the best strategy is to issue a ->show() in the root cache, and
    then ->store() in the memcg cache.

    The drawback is that sysfs can allocate up to a page of buffer for
    show(), which we are likely not to need but also cannot guarantee. To
    avoid always allocating a page for that, we can update the caches at
    store time with the maximum attribute size ever stored to the root
    cache; we will then get a buffer big enough to hold it. The corollary
    is that if no stores have happened, nothing will be propagated.

    It can also happen that a root cache has its tunables updated during
    normal system operation. In this case, we will propagate the change to
    all caches that are already active (a sketch of the propagation loop
    follows this entry).

    [akpm@linux-foundation.org: tweak code to avoid __maybe_unused]
    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
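
    A sketch of the propagation loop. struct slab_attribute mirrors the
    show/store pairs SLUB defines for its sysfs files; the attrs, nr_attrs
    and buf parameters are stand-ins for the allocator's attribute table
    and its page-sized buffer.

        #include <linux/sysfs.h>

        struct slab_attribute {
                struct attribute attr;
                ssize_t (*show)(struct kmem_cache *s, char *buf);
                ssize_t (*store)(struct kmem_cache *s, const char *buf, size_t len);
        };

        static void propagate_tunables(struct kmem_cache *root, struct kmem_cache *copy,
                                       struct slab_attribute **attrs,
                                       unsigned int nr_attrs, char *buf)
        {
                unsigned int i;

                for (i = 0; i < nr_attrs; i++) {
                        ssize_t len;

                        if (!attrs[i]->store)   /* read-only field: nothing to copy */
                                continue;

                        len = attrs[i]->show(root, buf);        /* current root value */
                        if (len > 0)
                                attrs[i]->store(copy, buf, len); /* replay into the copy */
                }
        }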
     
  • We are able to match a cache allocation to a particular memcg. If the
    task doesn't change groups during the allocation itself - a rare event -
    this will give us a good picture of which group is the first to touch a
    cache page.

    This patch uses the now available infrastructure by calling
    memcg_kmem_get_cache() before all the cache allocations.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • For the kmem slab controller, we need to record some extra information in
    the kmem_cache structure.

    Signed-off-by: Glauber Costa
    Signed-off-by: Suleiman Souhlal
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     

14 Jun, 2012

1 commit

  • Define a struct that describes common fields used in all slab allocators.
    A slab allocator either uses the common definition (like SLOB) or is
    required to provide members of kmem_cache with the definition given.

    After that it will be possible to share code that
    only operates on those fields of kmem_cache.

    The patch basically takes the SLOB definition of kmem_cache and
    uses those field names for the other allocators.

    It also standardizes the names used for the basic object lengths in
    the allocators (a sketch follows this entry):

    object_size  Struct size specified at kmem_cache_create(). Basically
                 the payload expected to be used by the subsystem.

    size         The size allocated for each object. This size
                 is larger than object_size and includes padding, alignment
                 and extra metadata for each object (e.g. for debugging
                 and rcu).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
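
    A sketch of the two standardized lengths on the shared fields (the
    field subset and struct name here are illustrative):

        struct kmem_cache_common_sketch {
                unsigned int object_size;       /* payload requested at kmem_cache_create() */
                unsigned int size;              /* full per-object footprint: object_size
                                                 * plus alignment, padding and debug/rcu
                                                 * metadata, so size >= object_size */
                const char *name;
        };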
     

01 Jun, 2012

1 commit

  • The node field is always page_to_nid(c->page), so it is rather easy to
    replace. Note that there may be slightly more overhead in various hot
    paths due to the need to shift the bits from page->flags. However, that
    is mostly compensated for by the smaller footprint of the kmem_cache_cpu
    structure (this patch reduces it to 3 words per cache), which allows
    better caching.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

29 Mar, 2012

1 commit

  • Pull SLAB changes from Pekka Enberg:
    "There's the new kmalloc_array() API, minor fixes and performance
    improvements, but quite honestly, nothing terribly exciting."

    * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
    mm: SLAB Out-of-memory diagnostics
    slab: introduce kmalloc_array()
    slub: per cpu partial statistics change
    slub: include include for prefetch
    slub: Do not hold slub_lock when calling sysfs_slab_add()
    slub: prefetch next freelist pointer in slab_alloc()
    slab, cleanup: remove unneeded return

    Linus Torvalds
     

05 Mar, 2012

1 commit

  • If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
    other BUG variant in a static inline (i.e. not in a #define), then
    that header really should be including <linux/bug.h> and not just
    expecting it to be implicitly present (an illustrative header follows
    this entry).

    We can make this change risk-free, since if the files using these
    headers didn't already have exposure to linux/bug.h, they would have
    been causing compile failures/warnings.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
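
    An illustrative header following the rule above: the static inline uses
    BUG_ON(), so the header pulls in <linux/bug.h> itself instead of relying
    on an indirect include (the header and function names are hypothetical).

        /* example.h */
        #include <linux/bug.h>

        static inline void must_not_be_null(const void *p)
        {
                BUG_ON(!p);
        }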
     

18 Feb, 2012

1 commit

  • This patch splits cpu_partial_free into 2 parts: cpu_partial_node, the
    number of PCP refills from the node partial list; and cpu_partial_free
    (same name as before), the number of PCP refills in the slab_free slow
    path. A new statistic, 'cpu_partial_drain', is added to count PCP drains
    to the node partial list. This information is useful when doing PCP
    tuning.

    The slabinfo.c code is unchanged, since cpu_partial_node is not on the
    slow path.

    Signed-off-by: Alex Shi
    Acked-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Alex Shi
     

28 Sep, 2011

1 commit


20 Aug, 2011

1 commit

  • Allow filling out the rest of the kmem_cache_cpu cacheline with pointers
    to partial pages (a sketch of the structure follows this entry). The
    partial page list is used in slab_free() to avoid taking the per-node
    lock.

    In __slab_alloc() we can then take multiple partial pages off the per
    node partial list in one go reducing node lock pressure.

    We can also use the per cpu partial list in slab_alloc() to avoid scanning
    partial lists for pages with free objects.

    The main effect of a per cpu partial list is that the per node list_lock
    is taken for batches of partial pages instead of individual ones.

    Potential future enhancements:

    1. The pickup from the partial list could perhaps be done without
    disabling interrupts, with some work. The free path already puts the
    page into the per-cpu partial list without disabling interrupts.

    2. __slab_free() may have some code paths that could use optimization.

    Performance:

    Before After
    ./hackbench 100 process 200000
    Time: 1953.047 1564.614
    ./hackbench 100 process 20000
    Time: 207.176 156.940
    ./hackbench 100 process 20000
    Time: 204.468 156.940
    ./hackbench 100 process 20000
    Time: 204.879 158.772
    ./hackbench 10 process 20000
    Time: 20.153 15.853
    ./hackbench 10 process 20000
    Time: 20.153 15.986
    ./hackbench 10 process 20000
    Time: 19.363 16.111
    ./hackbench 1 process 20000
    Time: 2.518 2.307
    ./hackbench 1 process 20000
    Time: 2.258 2.339
    ./hackbench 1 process 20000
    Time: 2.864 2.163

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
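
    A sketch of the structural change (an illustrative subset; the real
    struct kmem_cache_cpu also carries a transaction id and optional
    statistics):

        struct kmem_cache_cpu {
                void **freelist;        /* next free object in the active slab */
                struct page *page;      /* slab currently allocated from */
                struct page *partial;   /* per-cpu partial slabs, chained via
                                         * page->next; filled and drained in
                                         * batches so the per-node list_lock is
                                         * taken once per batch, not per page */
        };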
     

31 Jul, 2011

1 commit

  • * 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (21 commits)
    slub: When allocating a new slab also prep the first object
    slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock
    Avoid duplicate _count variables in page_struct
    Revert "SLUB: Fix build breakage in linux/mm_types.h"
    SLUB: Fix build breakage in linux/mm_types.h
    slub: slabinfo update for cmpxchg handling
    slub: Not necessary to check for empty slab on load_freelist
    slub: fast release on full slab
    slub: Add statistics for the case that the current slab does not match the node
    slub: Get rid of the another_slab label
    slub: Avoid disabling interrupts in free slowpath
    slub: Disable interrupts in free_debug processing
    slub: Invert locking and avoid slab lock
    slub: Rework allocator fastpaths
    slub: Pass kmem_cache struct to lock and freeze slab
    slub: explicit list_lock taking
    slub: Add cmpxchg_double_slab()
    mm: Rearrange struct page
    slub: Move page->frozen handling near where the page->freelist handling occurs
    slub: Do not use frozen page flag but a bit in the page counters
    ...

    Linus Torvalds
     

08 Jul, 2011

1 commit


02 Jul, 2011

3 commits


17 Jun, 2011

1 commit

  • Every slab allocator has its own alignment definition in
    include/linux/sl?b_def.h. Extract those and define a common set in
    include/linux/slab.h (a sketch of the fallback follows this entry).

    SLOB: As noted, we sometimes need double-word alignment on 32 bit. This
    gives all structures allocated by SLOB an unsigned long long alignment
    like the others.

    SLAB: If ARCH_SLAB_MINALIGN is not set, SLAB would set ARCH_SLAB_MINALIGN
    to zero, meaning no alignment at all. Give it the default unsigned long
    long alignment.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
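
    A sketch of the common fallback this creates in include/linux/slab.h
    (the exact guards may differ):

        #ifndef ARCH_SLAB_MINALIGN
        #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
        #endif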
     

21 May, 2011

1 commit


08 May, 2011

1 commit

  • Remove the #ifdefs. This means that irqsafe_cpu_cmpxchg_double() is used
    everywhere (a sketch of the resulting fastpath follows this entry).

    There may be performance implications since:

    A. We now have to manage a transaction ID for all arches

    B. The interrupt holdoff for arches not supporting CONFIG_CMPXCHG_LOCAL is reduced
    to a very short irqoff section.

    There are no multiple irqoff/irqon sequences as a result of this change. Even in the fallback
    case we only have to do one disable and enable like before.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
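
    A sketch of the fastpath this makes universal; helper names such as
    get_freepointer() and next_tid() follow mm/slub.c of that era, and the
    snippet is illustrative rather than a drop-in copy.

        static void *fastpath_alloc_sketch(struct kmem_cache *s)
        {
                struct kmem_cache_cpu *c;
                unsigned long tid;
                void *object;

                do {
                        c = __this_cpu_ptr(s->cpu_slab);
                        tid = c->tid;           /* transaction id read first */
                        object = c->freelist;
                        if (!object)
                                return NULL;    /* the real code falls back to the slowpath */
                        /*
                         * Freelist head and tid are swapped as one atomic pair; a
                         * preemption or interrupt in between changes the tid, so
                         * the cmpxchg fails and the loop retries - no interrupt
                         * disabling on the fastpath.
                         */
                } while (!irqsafe_cpu_cmpxchg_double(s->cpu_slab->freelist,
                                                     s->cpu_slab->tid,
                                                     object, tid,
                                                     get_freepointer(s, object),
                                                     next_tid(tid)));

                return object;
        }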
     

23 Mar, 2011

1 commit