15 Jul, 2013

2 commits

  • The __cpuinit type of throwaway sections might have made sense
    some time ago when RAM was more constrained, but now the savings
    do not offset the cost and complications. The fix in commit
    5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a
    good example of the nasty type of bug that can be created by
    improper use of the various __init prefixes.

    After a discussion on LKML[1] it was decided that cpuinit should go
    the way of devinit and be phased out. Once all the users are gone,
    we can then finally remove the macros themselves from linux/init.h.

    This removes all the uses of the __cpuinit macros from C files in
    the core kernel directories (kernel, init, lib, mm, and include)
    that don't really have a specific maintainer.

    [1] https://lkml.org/lkml/2013/5/20/589

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
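
    The change itself is mechanical. A hedged before/after sketch of what
    dropping the annotation looks like (the notifier function shown is
    hypothetical, not one from the actual patch):

        /* Before: __cpuinit placed the function in a discardable section. */
        static int __cpuinit cpu_callback(struct notifier_block *nfb,
                                          unsigned long action, void *hcpu);

        /* After: the annotation is simply dropped; the code stays resident. */
        static int cpu_callback(struct notifier_block *nfb,
                                unsigned long action, void *hcpu);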
     
  • Pull slab update from Pekka Enberg:
    "Highlights:

    - Fix for boot-time problems on some architectures due to
    init_lock_keys() not respecting kmalloc_caches boundaries
    (Christoph Lameter)

    - CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim)

    - Fix for excessive slab freelist draining (Wanpeng Li)

    - SLUB and SLOB cleanups and fixes (various people)"

    I ended up editing the branch: this avoids two commits at the end
    that were immediately reverted, and instead I just applied the
    one-liner fix in between myself.

    * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
    slub: Check for page NULL before doing the node_match check
    mm/slab: Give s_next and s_stop slab-specific names
    slob: Check for NULL pointer before calling ctor()
    slub: Make cpu partial slab support configurable
    slab: add kmalloc() to kernel API documentation
    slab: fix init_lock_keys
    slob: use DIV_ROUND_UP where possible
    slub: do not put a slab to cpu partial list when cpu_partial is 0
    mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo
    mm/slub: Drop unnecessary nr_partials
    mm/slab: Fix /proc/slabinfo unwriteable for slab
    mm/slab: Sharing s_next and s_stop between slab and slub
    mm/slab: Fix drain freelist excessively
    slob: Rework #ifdeffery in slab.h
    mm, slab: moved kmem_cache_alloc_node comment to correct place

    Linus Torvalds
     

07 Jul, 2013

3 commits

  • Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
    CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.

    In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
    init_lock_keys() dereference beyond kmalloc_caches[26].
    This leads to an unbootable system (kernel panic while initializing SLAB)
    if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.

    Fix this by making sure that init_lock_keys() does not dereference beyond
    the end of the kmalloc_caches array (see the bounded-loop sketch after
    this entry).

    Signed-off-by: Christoph Lameter
    Reported-by: Tetsuo Handa
    Cc: Pekka Enberg
    Cc: [3.10.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Pekka Enberg

    Christoph Lameter
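
    A minimal sketch of the bounded loop described above, assuming the
    3.10-era names (kmalloc_caches[], KMALLOC_SHIFT_HIGH); it is
    illustrative rather than the literal upstream diff:

        static void init_lock_keys(void)
        {
                int i;

                /* Stop at the array bound, not at PAGE_SHIFT + MAX_ORDER. */
                for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
                        struct kmem_cache *cache = kmalloc_caches[i];

                        if (!cache)
                                continue;
                        /* ... per-node lockdep class assignment elided ... */
                }
        }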
     
  • This patch shares s_next and s_stop between slab and slub.

    Acked-by: Christoph Lameter
    Signed-off-by: Wanpeng Li
    Signed-off-by: Pekka Enberg

    Wanpeng Li
     
  • drain_freelist() is called to drain the slabs_free lists for cache reap,
    cache shrink, the memory hotplug callback, etc. Its tofree parameter
    should be the number of slabs to free, not the number of slab objects
    to free.

    This patch fixes the callers that pass a number of objects, making sure
    they pass a number of slabs instead (see the conversion sketch after
    this entry).

    Acked-by: Christoph Lameter
    Signed-off-by: Wanpeng Li
    Signed-off-by: Pekka Enberg

    Wanpeng Li
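
    A hedged sketch of the object-to-slab conversion the callers need,
    using the 3.10-era field names (free_objects, cachep->num) for
    illustration:

        /* Round up: how many whole slabs are needed to hold n->free_objects? */
        static int slabs_tofree(struct kmem_cache *cachep,
                                struct kmem_cache_node *n)
        {
                return (n->free_objects + cachep->num - 1) / cachep->num;
        }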
     

07 May, 2013

2 commits

  • Pull slab changes from Pekka Enberg:
    "The bulk of the changes are more slab unification from Christoph.

    There are also a few fixes from Aaron, Glauber, and Joonsoo thrown into
    the mix."

    * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (24 commits)
    mm, slab_common: Fix bootstrap creation of kmalloc caches
    slab: Return NULL for oversized allocations
    mm: slab: Verify the nodeid passed to ____cache_alloc_node
    slub: tid must be retrieved from the percpu area of the current processor
    slub: Do not dereference NULL pointer in node_match
    slub: add 'likely' macro to inc_slabs_node()
    slub: correct to calculate num of acquired objects in get_partial_node()
    slub: correctly bootstrap boot caches
    mm/sl[au]b: correct allocation type check in kmalloc_slab()
    slab: Fixup CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections
    slab: Handle ARCH_DMA_MINALIGN correctly
    slab: Common definition for kmem_cache_node
    slab: Rename list3/l3 to node
    slab: Common Kmalloc cache determination
    stat: Use size_t for sizes instead of unsigned
    slab: Common function to create the kmalloc array
    slab: Common definition for the array of kmalloc caches
    slab: Common constants for kmalloc boundaries
    slab: Rename nodelists to node
    slab: Common name for the per node structures
    ...

    Linus Torvalds
     
  • Pekka Enberg
     

01 May, 2013

1 commit

  • If the nodeid is > num_online_nodes(), this can cause an Oops and a
    panic(). The purpose of this patch is to assert when that condition
    holds, to aid debugging, rather than hitting some random NULL pointer
    dereference or page fault.

    This patch is in response to BZ#42967 [1]. It uses VM_BUG_ON so the
    check is compiled in only when CONFIG_DEBUG_VM is set, given that
    ____cache_alloc_node() is a hot code path (a sketch of the assertion
    follows this entry).

    [1]: https://bugzilla.kernel.org/show_bug.cgi?id=42967

    Signed-off-by: Aaron Tomlin
    Reviewed-by: Rik van Riel
    Acked-by: Christoph Lameter
    Acked-by: Rafael Aquini
    Acked-by: David Rientjes
    Signed-off-by: Pekka Enberg

    Aaron Tomlin
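
    A minimal sketch of the assertion described (exact placement assumed);
    VM_BUG_ON() compiles away unless CONFIG_DEBUG_VM is set:

        static void *____cache_alloc_node(struct kmem_cache *cachep,
                                          gfp_t flags, int nodeid)
        {
                /* Fail loudly here rather than faulting somewhere random later. */
                VM_BUG_ON(nodeid > num_online_nodes());

                /* ... normal node-local allocation path follows ... */
        }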
     

19 Dec, 2012

6 commits

  • This patch clarifies two aspects of cache attribute propagation.

    First, the expected context for the for_each_memcg_cache macro in
    memcontrol.h. The usages already in the codebase are safe. In mm/slub.c,
    it is trivially safe because the lock is acquired right before the loop.
    In mm/slab.c, it is less so: the lock is acquired by an outer function a
    few steps back in the stack, so a VM_BUG_ON() is added to make sure it is
    indeed safe.

    A comment is also added to detail why we are returning the value of the
    parent cache and ignoring the children's when we propagate the attributes.

    Signed-off-by: Glauber Costa
    Cc: Michal Hocko
    Cc: Kamezawa Hiroyuki
    Cc: Johannes Weiner
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
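
    A hedged sketch of the guard described above, as it might look at the
    top of the propagation path in mm/slab.c (exact call site assumed):

        /*
         * The walk over the memcg child caches is expected to run under
         * slab_mutex; assert that so misuse shows up immediately when
         * CONFIG_DEBUG_VM is enabled.
         */
        VM_BUG_ON(!mutex_is_locked(&slab_mutex));

        /* ... for_each_memcg_cache-style walk over the children follows ... */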
     
  • SLAB allows us to tune a particular cache behavior with tunables. When
    creating a new memcg cache copy, we'd like to preserve any tunables the
    parent cache already had.

    This could be done by an explicit call to do_tune_cpucache() after the
    cache is created. But this is not very convenient now that the caches are
    created from common code, since this function is SLAB-specific.

    Another method of doing that is taking advantage of the fact that
    do_tune_cpucache() is always called from enable_cpucache(), which is
    called at cache initialization. We can just preset the values, and then
    things work as expected.

    It can also happen that a root cache has its tunables updated during
    normal system operation. In this case, we will propagate the change to
    all caches that are already active.

    This change will require us to move the assignment of root_cache in
    memcg_params a bit earlier. We need this to be already set - which
    memcg_kmem_register_cache() will do - by the time we reach
    __kmem_cache_create().

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
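
    A hedged sketch of presetting a memcg copy from its root cache's
    tunables at enable_cpucache() time; memcg_root_cache() is an assumed
    helper and the flow is simplified compared to the real patch:

        static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp)
        {
                struct kmem_cache *root = memcg_root_cache(cachep); /* assumed */
                int limit = 0, batchcount = 0, shared = 0;

                if (root != cachep && root->limit) {
                        /* A memcg copy inherits whatever the root was tuned to. */
                        limit = root->limit;
                        batchcount = root->batchcount;
                        shared = root->shared;
                } else {
                        /* ... default sizing heuristics based on object size ... */
                }

                return do_tune_cpucache(cachep, limit, batchcount, shared, gfp);
        }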
     
  • Implement destruction of memcg caches. Right now, only caches where our
    reference counter is the last remaining are deleted. If there are any
    other reference counters around, we just leave the caches lying around
    until they go away.

    When that happens, a destruction function is called from the cache code.
    Caches are only destroyed in process context, so we queue them up for
    later processing in the general case.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
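
    A heavily hedged sketch of deferring the teardown to process context
    with a work item; the destroy work field, the back-pointer, and the
    helper names are assumptions for illustration only:

        static void kmem_cache_destroy_work_func(struct work_struct *w)
        {
                struct memcg_cache_params *p =
                        container_of(w, struct memcg_cache_params, destroy);

                /* Safe to sleep: the real teardown runs in process context. */
                kmem_cache_destroy(p->cachep);  /* back-pointer field assumed */
        }

        static void memcg_queue_cache_destruction(struct kmem_cache *cachep)
        {
                INIT_WORK(&cachep->memcg_params->destroy,
                          kmem_cache_destroy_work_func);
                schedule_work(&cachep->memcg_params->destroy);
        }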
     
  • We are able to match a cache allocation to a particular memcg. If the
    task doesn't change groups during the allocation itself (a rare event),
    this gives us a good picture of which group is the first to touch a
    cache page.

    This patch uses the now available infrastructure by calling
    memcg_kmem_get_cache() before all the cache allocations.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
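
    A minimal sketch of the hook described, placed at the top of the
    allocation path; the surrounding function shape is assumed, but
    memcg_kmem_get_cache() is the interface named above:

        static __always_inline void *slab_alloc(struct kmem_cache *cachep,
                                                gfp_t flags, unsigned long caller)
        {
                /* Redirect to the current memcg's copy of the cache, if any. */
                cachep = memcg_kmem_get_cache(cachep, flags);

                /* ... continue with the usual per-CPU fast path on cachep ... */
        }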
     
  • struct page already has this information. If we start chaining caches,
    this information will always be more trustworthy than whatever is passed
    into the function.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
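
    A hedged sketch of preferring the slab page's back-pointer over the
    caller-supplied cache; the helper name and field layout follow the
    era's code but should be read as assumptions:

        static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s,
                                                        void *x)
        {
                struct page *page = virt_to_head_page(x);

                /* struct page already records which cache the object came from. */
                return page->slab_cache ? page->slab_cache : s;
        }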
     
  • We currently provide lockdep annotation for kmalloc caches, and also
    caches that have SLAB_DEBUG_OBJECTS enabled. The reason for this is that
    we can quite frequently nest in the l3->list_lock lock, which is not
    something trivial to avoid.

    My proposal with this patch is to extend this to caches whose slab
    management object lives within the slab as well ("on_slab"). The need
    for this arose in the context of testing kmemcg-slab patches. With that
    patchset, we can have per-memcg kmalloc caches, so the same path that
    led to nesting between kmalloc caches could then lead to in-memcg
    nesting. Because those caches are not annotated, lockdep will trigger.

    Signed-off-by: Glauber Costa
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Frederic Weisbecker
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa
     

11 Dec, 2012

4 commits

  • Extract the code to do object alignment from the allocators.
    Do the alignment calculations in slab_common so that the
    __kmem_cache_create functions of the allocators do not have
    to deal with alignment.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
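
    A hedged sketch of the common alignment helper this describes, modeled
    on the era's calculate_alignment() in mm/slab_common.c:

        static unsigned long calculate_alignment(unsigned long flags,
                                                 unsigned long align,
                                                 unsigned long size)
        {
                if (flags & SLAB_HWCACHE_ALIGN) {
                        /* Shrink to the largest cache line the object still fills. */
                        unsigned long ralign = cache_line_size();

                        while (size <= ralign / 2)
                                ralign /= 2;
                        align = max(align, ralign);
                }

                if (align < ARCH_SLAB_MINALIGN)
                        align = ARCH_SLAB_MINALIGN;

                /* Never align below the word size. */
                return ALIGN(align, sizeof(void *));
        }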
     
  • Simplify setup and reduce code in kmem_cache_init(). This allows us to
    get rid of initarray_cache as well as the manual setup code for
    the kmem_cache and kmem_cache_node arrays during bootstrap.

    We introduce a new bootstrap state "PARTIAL" for slab that signals the
    creation of a kmem_cache boot cache.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Use a special function to create kmalloc caches and use that function in
    SLAB and SLUB.

    Acked-by: Joonsoo Kim
    Reviewed-by: Glauber Costa
    Acked-by: David Rientjes
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • The nodelists field in kmem_cache points to the first unused
    object in the array field when bootstrap is complete.

    A problem with the current approach is that the statically sized
    kmem_cache structure used during boot can only contain NR_CPUS entries.
    If the number of nodes plus the number of cpus is greater than that,
    we would overwrite memory following the kmem_cache_boot definition.

    Increase the size of the array field to ensure that the node
    pointers also fit into it (see the sketch after this entry).

    Once we do that we no longer need the kmem_cache_nodelists
    array and we can then also use that structure elsewhere.

    Acked-by: Glauber Costa
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
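
    A hedged sketch of the sizing change described, with the field names
    taken from the era's mm/slab.c but treated as assumptions:

        struct kmem_cache {
                /* ... per-cache fields elided ... */

                /*
                 * Must be last: sized so the statically allocated boot cache
                 * has room for NR_CPUS array_cache pointers plus the
                 * MAX_NUMNODES node pointers that follow them.
                 */
                struct array_cache *array[NR_CPUS + MAX_NUMNODES];
        };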
     

15 Nov, 2012

1 commit

  • Fix new kernel-doc warnings in mm/slab.c:

    Warning(mm/slab.c:2358): No description found for parameter 'cachep'
    Warning(mm/slab.c:2358): Excess function parameter 'name' description in '__kmem_cache_create'
    Warning(mm/slab.c:2358): Excess function parameter 'size' description in '__kmem_cache_create'
    Warning(mm/slab.c:2358): Excess function parameter 'align' description in '__kmem_cache_create'
    Warning(mm/slab.c:2358): Excess function parameter 'ctor' description in '__kmem_cache_create'

    Signed-off-by: Randy Dunlap
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Matt Mackall
    Signed-off-by: Pekka Enberg

    Randy Dunlap
     

31 Oct, 2012

2 commits

  • Some flags are used internally by the allocators for management
    purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses
    to mark that the metadata for that cache is stored outside of the slab.

    No cache should ever pass those as creation flags. We can just ignore
    this bit if it happens to be passed (such as when duplicating a cache in
    the kmem memcg patches).

    Because such flags can vary from allocator to allocator, we allow each
    allocator to make its own decision here, defining SLAB_AVAILABLE_FLAGS
    with all flags that are valid at creation time. Allocators that don't
    have any specific flag requirements can define it to include all flags.

    Common code will mask out all flags not belonging to that set.

    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Signed-off-by: Glauber Costa
    Signed-off-by: Pekka Enberg

    Glauber Costa
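
    A minimal sketch of the masking done in common code; SLAB_AVAILABLE_FLAGS
    is the per-allocator definition described above, and the call site shown
    is an assumption:

        /*
         * In kmem_cache_create(): internal management bits such as
         * CFLGS_OFF_SLAB are silently dropped before the cache is created.
         */
        flags &= SLAB_AVAILABLE_FLAGS;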
     
  • This function is identically defined in all three allocators,
    so it's trivial to move it to slab.h.

    Since it is now a static, inline, header-defined function,
    this patch also drops the EXPORT_SYMBOL tag.

    Cc: Pekka Enberg
    Cc: Matt Mackall
    Acked-by: Christoph Lameter
    Signed-off-by: Ezequiel Garcia
    Signed-off-by: Pekka Enberg

    Ezequiel Garcia
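
    In the era's tree the function in question appears to be
    kmem_cache_size(); a hedged sketch of the header-defined version, with
    the field name and header location assumed:

        /* One definition in the shared slab header, used by slab, slub and slob. */
        static inline unsigned int kmem_cache_size(struct kmem_cache *s)
        {
                return s->object_size;
        }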
     

24 Oct, 2012

3 commits

  • With all the infrastructure in place, we can now have slabinfo_show
    done from slab_common.c. A cache-specific function is called to grab
    information about the cache itself, since that is still heavily
    dependent on the implementation. But with the values produced by it, all
    the printing and handling is done from common code.

    Signed-off-by: Glauber Costa
    CC: Christoph Lameter
    CC: David Rientjes
    Signed-off-by: Pekka Enberg

    Glauber Costa
     
  • The header format is highly similar between slab and slub. The main
    difference lies in the fact that slab may optionally have statistics
    added here in the case of CONFIG_SLAB_DEBUG, while slub sticks them
    somewhere else.

    By making sure that information conditionally lives inside a
    globally-visible CONFIG_DEBUG_SLAB switch, we can move the header
    printing to a common location.

    Signed-off-by: Glauber Costa
    Acked-by: Christoph Lameter
    CC: David Rientjes
    Signed-off-by: Pekka Enberg

    Glauber Costa
     
  • This patch moves all the common machinery for slabinfo processing
    to slab_common.c. We could do better by noticing that the output is
    heavily common and having the allocators just provide finished
    information about it, but after this first step that becomes easier
    to do.

    Signed-off-by: Glauber Costa
    Acked-by: Christoph Lameter
    CC: David Rientjes
    Signed-off-by: Pekka Enberg

    Glauber Costa
     

07 Oct, 2012

1 commit

  • Pull SLAB changes from Pekka Enberg:
    "New and noteworthy:

    * More SLAB allocator unification patches from Christoph Lameter and
    others. This paves the way for slab memcg patches that hopefully
    will land in v3.8.

    * SLAB tracing improvements from Ezequiel Garcia.

    * Kernel tainting upon SLAB corruption from Dave Jones.

    * Miscellaneous SLAB allocator bug fixes and improvements from various
    people."

    * 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (43 commits)
    slab: Fix build failure in __kmem_cache_create()
    slub: init_kmem_cache_cpus() and put_cpu_partial() can be static
    mm/slab: Fix kmem_cache_alloc_node_trace() declaration
    Revert "mm/slab: Fix kmem_cache_alloc_node_trace() declaration"
    mm, slob: fix build breakage in __kmalloc_node_track_caller
    mm/slab: Fix kmem_cache_alloc_node_trace() declaration
    mm/slab: Fix typo _RET_IP -> _RET_IP_
    mm, slub: Rename slab_alloc() -> slab_alloc_node() to match SLAB
    mm, slab: Rename __cache_alloc() -> slab_alloc()
    mm, slab: Match SLAB and SLUB kmem_cache_alloc_xxx_trace() prototype
    mm, slab: Replace 'caller' type, void* -> unsigned long
    mm, slob: Add support for kmalloc_track_caller()
    mm, slab: Remove silly function slab_buffer_size()
    mm, slob: Use NUMA_NO_NODE instead of -1
    mm, sl[au]b: Taint kernel when we detect a corrupted slab
    slab: Only define slab_error for DEBUG
    slab: fix the DEADLOCK issue on l3 alien lock
    slub: Zero initial memory segment for kmem_cache and kmem_cache_node
    Revert "mm/sl[aou]b: Move sysfs_slab_add to common"
    mm/sl[aou]b: Move kmem_cache refcounting to common code
    ...

    Linus Torvalds
     

03 Oct, 2012

2 commits