19 Dec, 2012

5 commits

  • SLAB allows us to tune a particular cache behavior with tunables. When
    creating a new memcg cache copy, we'd like to preserve any tunables the
    parent cache already had.

    This could be done by an explicit call to do_tune_cpucache() after the
    cache is created. But this is not very convenient now that the caches are
    created from common code, since this function is SLAB-specific.

    Another method of doing that is taking advantage of the fact that
    do_tune_cpucache() is always called from enable_cpucache(), which is
    called at cache initialization. We can just preset the values, and then
    things work as expected.

    It can also happen that a root cache has its tunables updated during
    normal system operation. In this case, we will propagate the change to
    all caches that are already active.

    This change requires moving the assignment of root_cache in
    memcg_params a bit earlier: it must already be set - which
    memcg_kmem_register_cache will do - by the time we reach
    __kmem_cache_create().
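
    As an illustration, the propagation could look roughly like this
    (a sketch only: the array walk and the memcg_params layout are
    assumptions about this series; do_tune_cpucache() is SLAB's
    internal tuner):

    /* Sketch: when a root cache is re-tuned via /proc/slabinfo,
     * apply the same values to every memcg copy that is already
     * active.  Helper and field names are assumed for illustration. */
    static void propagate_tunables(struct kmem_cache *root,
                                   int limit, int batchcount, int shared)
    {
            int i;

            for (i = 0; i < memcg_limited_groups_array_size; i++) {
                    struct kmem_cache *c =
                            root->memcg_params->memcg_caches[i];

                    if (c)  /* only caches that are already active */
                            do_tune_cpucache(c, limit, batchcount,
                                             shared, GFP_KERNEL);
            }
    }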

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • When we create caches in memcgs, we need to display their usage
    information somewhere. We'll adopt a scheme similar to /proc/meminfo,
    with aggregate totals shown in the global file, and per-group information
    stored in the group itself.

    For the time being, only reads are allowed in the per-group cache.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • Implement destruction of memcg caches. Right now, a cache is deleted
    only when our reference counter is the last one remaining. If any
    other reference counters are still around, we just leave the cache
    lying around until they go away.

    When that happens, a destruction function is called from the cache code.
    Caches are only destroyed in process context, so we queue them up for
    later processing in the general case.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • Allow a memcg parameter to be passed during cache creation. When the slub
    allocator is being used, it will only merge caches that belong to the same
    memcg. We do this by scanning the global list, and then translating the
    cache to a memcg-specific cache.

    The default function is created as a wrapper that passes NULL to the
    memcg version.

    A helper, memcg_css_id, is provided because slub needs a unique cache
    name for sysfs. Since sysfs is visible, but not the canonical location
    for slab data, the full cache name is not needed; the css_id should
    suffice.
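
    A minimal sketch of the wrapper side, assuming (as in this series)
    that the memcg-aware entry point is kmem_cache_create_memcg():

    /* The old entry point becomes a thin wrapper that passes NULL,
     * i.e. the root cgroup, as the memcg parameter. */
    struct kmem_cache *
    kmem_cache_create(const char *name, size_t size, size_t align,
                      unsigned long flags, void (*ctor)(void *))
    {
            return kmem_cache_create_memcg(NULL, name, size, align,
                                           flags, ctor);
    }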

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     
  • For the kmem slab controller, we need to record some extra information in
    the kmem_cache structure.
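
    Roughly, the extra state looks like this (a sketch; the exact
    field set is an assumption and grew over the series):

    /* Extra per-cache data hung off struct kmem_cache for the kmem
     * slab controller; abridged, names assumed for illustration. */
    struct memcg_cache_params {
            bool is_root_cache;
            struct mem_cgroup *memcg;      /* owning group, for copies */
            struct kmem_cache *root_cache; /* parent the copy came from */
    };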

    Signed-off-by: Glauber Costa
    Signed-off-by: Suleiman Souhlal
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     

31 Oct, 2012

1 commit

  • This function is identically defined in all three allocators,
    so it is trivial to move it to slab.h.

    Since it is now a static inline function defined in a header,
    this patch also drops the EXPORT_SYMBOL tag.
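
    The log does not name the function, but an accessor of this shape
    fits the description once the common kmem_cache fields exist
    (illustrative only):

    /* Identical in all three allocators, so it can live in
     * include/linux/slab.h as a static inline, with no EXPORT_SYMBOL. */
    static inline unsigned int kmem_cache_size(struct kmem_cache *s)
    {
            return s->object_size;
    }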

    Cc: Pekka Enberg
    Cc: Matt Mackall
    Acked-by: Christoph Lameter
    Signed-off-by: Ezequiel Garcia
    Signed-off-by: Pekka Enberg

    Ezequiel Garcia
     

25 Sep, 2012

1 commit

  • Currently slob falls back to regular kmalloc for this case.
    With this patch, kmalloc_track_caller() is implemented correctly,
    so the specified caller is traced.

    This is important for accurately tracing allocations performed by
    krealloc, kstrdup, kmemdup, etc.
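
    For context, the caller-tracking entry point is a macro over an
    internal helper that receives the caller's return address; a
    sketch of the shape involved:

    /* With tracking implemented, slob provides
     * __kmalloc_track_caller() and the generic macro records the
     * immediate caller via _RET_IP_ instead of collapsing to a
     * plain __kmalloc(). */
    extern void *__kmalloc_track_caller(size_t size, gfp_t flags,
                                        unsigned long caller);
    #define kmalloc_track_caller(size, flags) \
            __kmalloc_track_caller(size, flags, _RET_IP_)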

    Signed-off-by: Ezequiel Garcia
    Signed-off-by: Pekka Enberg

    Ezequiel Garcia
     

09 Jul, 2012

2 commits


14 Jun, 2012

1 commit

  • Define a struct that describes common fields used in all slab allocators.
    A slab allocator either uses the common definition (like SLOB) or is
    required to provide members of kmem_cache with the definition given.

    After that it will be possible to share code that
    only operates on those fields of kmem_cache.

    The patch basically takes the slob definition of kmem_cache and
    uses the field names for the other allocators.

    It also standardizes the names used for basic object lengths in
    allocators:

    object_size Struct size specified at kmem_cache_create. Basically
    the payload expected to be used by the subsystem.

    size The size of the memory allocated for each object. This size
    is larger than object_size and includes padding, alignment
    and extra metadata for each object (e.g. for debugging
    and rcu).
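
    A sketch of the shared layout (abridged, not the full definition):

    /* Common fields every allocator now provides under these names. */
    struct kmem_cache {
            unsigned int object_size; /* payload size, as given to
                                         kmem_cache_create() */
            unsigned int size;        /* object_size plus padding,
                                         alignment and metadata */
            unsigned int align;
            unsigned long flags;
            const char *name;
            int refcount;
            void (*ctor)(void *);
            struct list_head list;
    };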

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

01 Jun, 2012

1 commit

  • ULONG_MAX is often used to check for integer overflow when calculating
    allocation size. While ULONG_MAX happens to work on most systems, there
    is no guarantee that `size_t' must be the same size as `long'.

    This patch introduces SIZE_MAX, the maximum value of `size_t', to improve
    portability and readability for allocation size validation.
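
    For example, an overflow check on an n * size allocation then
    reads naturally in terms of size_t:

    /* Equivalent to the old ULONG_MAX check where size_t and long
     * have the same width, but correct even where they differ. */
    if (size != 0 && n > SIZE_MAX / size)
            return NULL;    /* n * size would overflow */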

    Signed-off-by: Xi Wang
    Acked-by: Alex Elder
    Cc: David Airlie
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xi Wang
     

06 Mar, 2012

1 commit

  • Introduce a kmalloc_array() wrapper that performs integer overflow
    checking without zeroing the memory.
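
    A minimal sketch of the wrapper, mirroring kcalloc() minus the
    zeroing:

    /* Like kcalloc(), but without __GFP_ZERO. */
    static inline void *kmalloc_array(size_t n, size_t size, gfp_t flags)
    {
            if (size != 0 && n > ULONG_MAX / size)
                    return NULL;    /* n * size would overflow */
            return __kmalloc(n * size, flags);
    }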

    Suggested-by: Andrew Morton
    Suggested-by: Jens Axboe
    Signed-off-by: Xi Wang
    Cc: Dan Carpenter
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Pekka Enberg

    Xi Wang
     

08 Jul, 2011

1 commit


17 Jun, 2011

1 commit

  • Every slab allocator has its own alignment definition in
    include/linux/sl?b_def.h. Extract those and define a common set in
    include/linux/slab.h.

    SLOB: As noted, we sometimes need double-word alignment on 32-bit.
    This gives all structures allocated by SLOB an unsigned long long
    alignment, like the other allocators.

    SLAB: If ARCH_SLAB_MINALIGN is not set, SLAB would set
    ARCH_SLAB_MINALIGN to zero, meaning no alignment at all. Give it the
    default unsigned long long alignment instead.
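
    The common fallbacks amount to something like this (a sketch; an
    architecture can still override either value):

    #ifndef ARCH_KMALLOC_MINALIGN
    #define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
    #endif

    #ifndef ARCH_SLAB_MINALIGN
    #define ARCH_SLAB_MINALIGN __alignof__(unsigned long long)
    #endif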

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     

24 Jan, 2011

1 commit


07 Jan, 2011

1 commit


05 Jul, 2010

1 commit

  • In slab, all the __xxx_track_caller variants are defined under
    CONFIG_DEBUG_SLAB || CONFIG_TRACING, so caller tracking should also
    work when only CONFIG_TRACING is set. But if CONFIG_DEBUG_SLAB is not
    set, include/linux/slab.h defines xxx_track_caller to plain __xxx()
    without considering CONFIG_TRACING, which breaks caller tracking in
    that configuration.
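
    The fix is to key the fallback on both options; schematically:

    /* Only collapse the tracking variants to plain __kmalloc() when
     * neither debugging nor tracing can make use of the caller. */
    #if defined(CONFIG_DEBUG_SLAB) || defined(CONFIG_TRACING)
    extern void *__kmalloc_track_caller(size_t, gfp_t, unsigned long);
    #define kmalloc_track_caller(size, flags) \
            __kmalloc_track_caller(size, flags, _RET_IP_)
    #else
    #define kmalloc_track_caller(size, flags) \
            __kmalloc(size, flags)
    #endif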

    Cc: Christoph Lameter
    Cc: Matt Mackall
    Cc: Vegard Nossum
    Cc: Dmitry Monakhov
    Cc: Catalin Marinas
    Acked-by: David Rientjes
    Signed-off-by: Xiaotian Feng
    Signed-off-by: Pekka Enberg

    Xiaotian Feng
     

10 Apr, 2010

1 commit

  • As suggested by Linus, introduce a kern_ptr_validate() helper that does
    some sanity checks to make sure a pointer is a valid kernel pointer.
    This is a preparatory step for fixing SLUB's kmem_ptr_validate().
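
    The checks are of this flavor (a sketch, hedging the details):

    /* Return 1 if ptr looks like a valid kernel pointer to a region
     * of the given size, 0 otherwise. */
    int kern_ptr_validate(const void *ptr, unsigned long size)
    {
            unsigned long addr = (unsigned long)ptr;

            if (addr < PAGE_OFFSET)                       /* not kernel space */
                    return 0;
            if (addr > (unsigned long)high_memory - size) /* past memory */
                    return 0;
            if (addr & (sizeof(void *) - 1))              /* misaligned */
                    return 0;
            if (!kern_addr_valid(addr) ||
                !kern_addr_valid(addr + size - 1))        /* unmapped */
                    return 0;
            return 1;
    }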

    Cc: Andrew Morton
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Ingo Molnar
    Cc: Matt Mackall
    Cc: Nick Piggin
    Signed-off-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     

27 Feb, 2010

1 commit

  • This patch allows injecting faults only for specific slabs.
    To preserve the default behavior, the cache filter is off by
    default (all caches are faulty).

    One may define a specific set of slabs like this:
    # mark skbuff_head_cache as faulty
    echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab
    # Turn on cache filter (off by default)
    echo 1 > /sys/kernel/debug/failslab/cache-filter
    # Turn on fault injection
    echo 1 > /sys/kernel/debug/failslab/times
    echo 1 > /sys/kernel/debug/failslab/probability

    Acked-by: David Rientjes
    Acked-by: Akinobu Mita
    Acked-by: Christoph Lameter
    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Pekka Enberg

    Dmitry Monakhov
     

15 Jun, 2009

2 commits

  • Conflicts:
    MAINTAINERS

    Signed-off-by: Vegard Nossum

    Vegard Nossum
     
  • With kmemcheck enabled, the slab allocator needs to do this:

    1. Tell kmemcheck to allocate the shadow memory which stores the status of
    each byte in the allocation proper, e.g. whether it is initialized or
    uninitialized.
    2. Tell kmemcheck which parts of memory should be marked uninitialized.
    There are actually a few more states, such as "not yet allocated" and
    "recently freed".

    If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
    memory that can take page faults because of kmemcheck.

    If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can
    still request memory with the __GFP_NOTRACK flag. This does not prevent
    the page faults from occurring, but it marks the object in question as
    initialized so that no warnings will ever be produced for this object.

    In addition to (and in contrast to) __GFP_NOTRACK, the
    __GFP_NOTRACK_FALSE_POSITIVE flag indicates that the allocation should
    not be tracked _because_ it would produce a false positive. Their values
    are identical, but need not be so in the future (for example, we could now
    enable/disable false positives with a config option).
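
    From a caller's point of view, the two knobs look like this
    (illustrative usage; cache and size names made up):

    struct kmem_cache *cache;
    void *obj;

    /* Per-cache opt-out: this cache never triggers kmemcheck page
     * faults. */
    cache = kmem_cache_create("example", object_size, 0,
                              SLAB_NOTRACK, NULL);

    /* Per-allocation opt-out: the object is marked initialized, so
     * kmemcheck produces no warnings for it. */
    obj = kmalloc(object_size, GFP_KERNEL | __GFP_NOTRACK);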

    Parts of this patch were contributed by Pekka Enberg but merged for
    atomicity.

    Signed-off-by: Vegard Nossum
    Signed-off-by: Pekka Enberg
    Signed-off-by: Ingo Molnar

    [rebased for mainline inclusion]
    Signed-off-by: Vegard Nossum

    Vegard Nossum
     

12 Jun, 2009

2 commits

  • As explained by Benjamin Herrenschmidt:

    Oh and btw, your patch alone doesn't fix powerpc, because it's missing
    a whole bunch of GFP_KERNEL's in the arch code... You would have to
    grep the entire kernel for things that check slab_is_available() and
    even then you'll be missing some.

    For example, slab_is_available() didn't always exist, and so in the
    early days on powerpc, we used a mem_init_done global that is set from
    mem_init() (not perfect, but works in practice). And we still have code
    using that to do the test.

    Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators
    in early boot code to avoid enabling interrupts.
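
    Schematically, while the system boots the allocators strip the
    flags that could sleep or start I/O (a sketch; the exact form of
    the mechanism in the tree differs):

    /* Don't sleep, do I/O or recurse into the FS before boot has
     * progressed far enough for those to be safe. */
    if (!slab_is_available())
            flags &= ~(__GFP_WAIT | __GFP_IO | __GFP_FS);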

    Signed-off-by: Pekka Enberg

    Pekka Enberg
     
  • This patch adds the kmemleak_(alloc|free) callbacks to the slab
    allocator. It also adds the SLAB_NOLEAKTRACE flag so that kmemleak
    does not recurse into itself when allocating its own data
    structures.
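
    Schematically, the hooks sit in the allocation and free paths
    (a sketch; callsites and names illustrative):

    /* After an object is allocated, register it with kmemleak so its
     * memory is scanned for pointers: */
    kmemleak_alloc(ptr, size, 1, flags);

    /* Before an object goes back to the cache, unregister it: */
    kmemleak_free(ptr);

    /* kmemleak's own caches opt out to avoid recursion: */
    object_cache = kmem_cache_create("kmemleak_object", object_size,
                                     0, SLAB_NOLEAKTRACE, NULL);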

    Signed-off-by: Catalin Marinas
    Reviewed-by: Pekka Enberg

    Catalin Marinas
     

21 Feb, 2009

1 commit

  • kzfree() is a wrapper for kfree() that additionally zeroes the underlying
    memory before releasing it to the slab allocator.

    Currently there is code which memset()s the memory region of an object
    before releasing it back to the slab allocator to make sure
    security-sensitive data are really zeroed out after use.

    These callsites can then just use kzfree() which saves some code, makes
    users greppable and allows for a stupid destructor that isn't necessarily
    aware of the actual object size.
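
    The whole helper is small; a sketch matching that description:

    /* Zero the full allocated region (ksize(), not just the size the
     * caller asked for) before handing it back to the allocator. */
    void kzfree(const void *p)
    {
            size_t ks;
            void *mem = (void *)p;

            if (unlikely(ZERO_OR_NULL_PTR(mem)))
                    return;
            ks = ksize(mem);
            memset(mem, 0, ks);
            kfree(mem);
    }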

    Signed-off-by: Johannes Weiner
    Reviewed-by: Pekka Enberg
    Cc: Matt Mackall
    Acked-by: Christoph Lameter
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

29 Dec, 2008

2 commits


26 Nov, 2008

1 commit


14 Nov, 2008

1 commit

  • Explain this SLAB_DESTROY_BY_RCU thing...
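
    The gist: with SLAB_DESTROY_BY_RCU the slab pages survive a grace
    period, but an individual object may be freed and reused
    immediately, so an RCU-side lookup must take a reference and then
    re-validate identity. Schematically (lookup, put_ref and refcnt
    are made-up names):

    rcu_read_lock();
    obj = lookup(key);
    if (obj && !atomic_inc_not_zero(&obj->refcnt))
            obj = NULL;            /* object is being freed */
    rcu_read_unlock();

    if (obj && obj->key != key) {  /* freed and reused: wrong object */
            put_ref(obj);
            obj = NULL;
    }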

    [hugh@veritas.com: add a pointer to comment in mm/slab.c]
    Signed-off-by: Peter Zijlstra
    Acked-by: Jens Axboe
    Acked-by: Paul E. McKenney
    Acked-by: Christoph Lameter
    Signed-off-by: Hugh Dickins
    Signed-off-by: Pekka Enberg

    Peter Zijlstra
     

23 Oct, 2008

1 commit


27 Jul, 2008

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
    netns: fix ip_rt_frag_needed rt_is_expired
    netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
    netfilter: fix double-free and use-after free
    netfilter: arptables in netns for real
    netfilter: ip{,6}tables_security: fix future section mismatch
    selinux: use nf_register_hooks()
    netfilter: ebtables: use nf_register_hooks()
    Revert "pkt_sched: sch_sfq: dump a real number of flows"
    qeth: use dev->ml_priv instead of dev->priv
    syncookies: Make sure ECN is disabled
    net: drop unused BUG_TRAP()
    net: convert BUG_TRAP to generic WARN_ON
    drivers/net: convert BUG_TRAP to generic WARN_ON

    Linus Torvalds
     
  • As suggested by Patrick McHardy, introduce a __krealloc() that doesn't
    free the original buffer, to fix a double-free and use-after-free bug
    I introduced in the RCU-using netfilter code.
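
    A sketch of the helper; unlike krealloc(), the original buffer is
    never freed here, so an RCU reader can keep using it until the
    caller disposes of it after a grace period:

    void *__krealloc(const void *p, size_t new_size, gfp_t flags)
    {
            void *ret;
            size_t ks = 0;

            if (unlikely(!new_size))
                    return ZERO_SIZE_PTR;
            if (p)
                    ks = ksize(p);
            if (ks >= new_size)        /* fits in the old object */
                    return (void *)p;

            ret = kmalloc_track_caller(new_size, flags);
            if (ret && p)
                    memcpy(ret, p, ks);
            return ret;                /* old buffer NOT freed */
    }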

    Reported-by: Patrick McHardy
    Signed-off-by: Pekka Enberg
    Tested-by: Dieter Ries
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Pekka Enberg
     
  • The kmem_cache passed to a constructor is only needed for constructors
    that are themselves multiplexers. Nobody uses this "feature", nor does
    anybody use the passed kmem_cache in a non-trivial way, so pass only a
    pointer to the object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

25 Jul, 2008

1 commit

  • While in all cases in the kernel we know the size of the elements to be
    created, we don't always know the count of elements. By commuting the
    size and count in the overflow check, the compiler can replace the
    runtime division of a size_t with a compare against a (unique) constant
    in these cases.
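
    Concretely, for a kcalloc()-style check with a compile-time
    constant element size:

    /* Before: divides by the runtime count n on every call. */
    if (n != 0 && size > ULONG_MAX / n)
            return NULL;

    /* After: size is a compile-time constant at almost every
     * callsite, so ULONG_MAX / size folds to a constant and the
     * size_t division disappears from the generated code. */
    if (size != 0 && n > ULONG_MAX / size)
            return NULL;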

    Signed-off-by: Milton Miller
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milton Miller
     

05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

07 Jun, 2008

1 commit

  • Add a helper to get zeroed-out memory from a particular NUMA node. To
    be used by sunrpc.
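
    The helper is a one-liner; a sketch, assuming (as in mainline)
    that it is kzalloc_node():

    /* Like kzalloc(), but allocate from a given NUMA node. */
    static inline void *kzalloc_node(size_t size, gfp_t flags, int node)
    {
            return kmalloc_node(size, flags | __GFP_ZERO, node);
    }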

    Signed-off-by: Jeff Layton
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Layton
     

30 Apr, 2008

2 commits


03 Jan, 2008

1 commit

  • Both SLUB and SLAB really did almost exactly the same thing for
    /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's.

    This just creates a common CONFIG_SLABINFO that is enabled by both SLUB
    and SLAB, and shares all the setup code. Maybe SLOB will want this some
    day too.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

17 Oct, 2007

2 commits

  • Slab constructors currently have a flags parameter that is never used.
    Moreover, the order of the arguments is the opposite of other slab
    functions: the object pointer is placed before the kmem_cache pointer.

    Convert

    ctor(void *object, struct kmem_cache *s, unsigned long flags)

    to

    ctor(struct kmem_cache *s, void *object)

    throughout the kernel.

    [akpm@linux-foundation.org: coupla fixes]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • This patch marks a number of allocations that are either short-lived,
    such as network buffers, or reclaimable, such as inode allocations.
    When something like updatedb is called, long-lived and unmovable kernel
    allocations tend to be spread throughout the address space, which
    increases fragmentation.

    This patch groups these allocations together as much as possible by
    adding a new migrate type. The MIGRATE_RECLAIMABLE type is for
    allocations that can be reclaimed on demand, but not moved; i.e., they
    can be migrated by deleting them and re-reading the information from
    elsewhere.
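
    On the slab side, a cache opts into the reclaimable group with
    SLAB_RECLAIM_ACCOUNT; illustrative usage (cache and struct names
    made up):

    /* Objects here can be reclaimed on demand (e.g. by pruning
     * inodes), so their backing pages are grouped into
     * MIGRATE_RECLAIMABLE blocks. */
    cache = kmem_cache_create("example_inode_cache",
                              sizeof(struct example_inode), 0,
                              SLAB_RECLAIM_ACCOUNT, NULL);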

    Signed-off-by: Mel Gorman
    Cc: Andy Whitcroft
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman