01 Jan, 2009

1 commit

  • Impact: Use new API

    Convert kernel mm functions to use struct cpumask.

    We skip include/linux/percpu.h and mm/allocpercpu.c, which are in flux.

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Reviewed-by: Christoph Lameter

    Rusty Russell
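
    The pattern behind this conversion is passing cpumasks by pointer
    (const struct cpumask *) instead of copying whole cpumask_t values. A
    minimal userspace sketch of that pattern; the bitmap layout and helper
    bodies here are simplified stand-ins, not the kernel's implementation:

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* Simplified model of struct cpumask: a fixed-size bitmap that is
     * always handed around by pointer, never copied by value. */
    #define NR_CPUS 64
    #define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    struct cpumask {
        unsigned long bits[(NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG];
    };

    static void cpumask_clear(struct cpumask *m)
    {
        memset(m->bits, 0, sizeof(m->bits));
    }

    static void cpumask_set_cpu(unsigned int cpu, struct cpumask *m)
    {
        m->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
    }

    static bool cpumask_test_cpu(unsigned int cpu, const struct cpumask *m)
    {
        return (m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
    }

    int main(void)
    {
        struct cpumask mask;

        cpumask_clear(&mask);
        cpumask_set_cpu(3, &mask);
        /* Callers pass &mask (a const struct cpumask *), avoiding an
         * on-stack copy of the whole bitmap on large NR_CPUS configs. */
        assert(cpumask_test_cpu(3, &mask));
        assert(!cpumask_test_cpu(4, &mask));
        puts("ok");
        return 0;
    }
    ```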
     

29 Dec, 2008

4 commits


26 Nov, 2008

3 commits


23 Oct, 2008

2 commits


30 Jul, 2008

1 commit


27 Jul, 2008

1 commit

  • The kmem cache passed to a constructor is only needed for constructors that
    are themselves multiplexers. Nobody uses this "feature", nor does anybody
    use the passed kmem cache in a non-trivial way, so pass only a pointer to
    the object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

16 Jul, 2008

2 commits

  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • With the removal of destructors, slab_destroy_objs no longer actually
    destroys any objects, making the kernel doc incorrect and the function
    name misleading.

    In keeping with the other debug functions, rename it to
    slab_destroy_debugcheck and drop the kernel doc.

    Signed-off-by: Rabin Vincent
    Signed-off-by: Pekka Enberg

    Rabin Vincent
     

26 Jun, 2008

1 commit


22 Jun, 2008

1 commit

  • The zonelist patches caused the loop that checks for available
    objects in permitted zones to not terminate immediately. One object
    per zone per allocation may be allocated and then abandoned.

    Break the loop when we have successfully allocated one object.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

30 Apr, 2008

2 commits

  • __FUNCTION__ is gcc-specific; use __func__

    Signed-off-by: Harvey Harrison
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
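
    For context, __FUNCTION__ is a GCC extension, while __func__ has been a
    standard predefined identifier since C99, so the substitution is purely
    mechanical. A minimal illustration:

    ```c
    #include <stdio.h>

    /* __func__ (C99) expands to the enclosing function's name,
     * exactly as GCC's non-standard __FUNCTION__ did. */
    static void report(void)
    {
        printf("in %s\n", __func__);
    }

    int main(void)
    {
        report();
        return 0;
    }
    ```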
     
  • We can see an ever-repeating problem pattern with objects of any kind in the
    kernel:

    1) freeing of active objects
    2) reinitialization of active objects

    Both problems can be hard to debug because the crash happens at a point where
    we have no chance to decode the root cause anymore. One problem spot is
    kernel timers, where the detection of the problem often happens in interrupt
    context and usually causes the machine to panic.

    While working on a timer related bug report I had to hack specialized code
    into the timer subsystem to get a reasonable hint for the root cause. This
    debug hack was fine for temporary use, but far from a mergeable solution due
    to the intrusiveness into the timer code.

    The code further lacked the ability to detect and report the root cause
    instantly and keep the system operational.

    Keeping the system operational is important to get hold of the debug
    information without special debugging aids like serial consoles and special
    knowledge of the bug reporter.

    The problems described above are not restricted to timers, but timers tend
    to expose them, usually as a full system crash. Other objects are less
    explosive, but the symptoms caused by such mistakes can be even harder to
    debug.

    Instead of creating specialized debugging code for the timer subsystem a
    generic infrastructure is created which allows developers to verify their code
    and provides an easy to enable debug facility for users in case of trouble.

    The debugobjects core code keeps track of operations on static and dynamic
    objects by inserting them into a hashed list and sanity-checking them on
    each object operation; it also provides additional checks whenever kernel
    memory is freed.

    The tracked object operations are:
    - initializing an object
    - adding an object to a subsystem list
    - deleting an object from a subsystem list

    Each operation is sanity checked before it is executed, and the
    subsystem-specific code can provide a fixup function that prevents the
    operation from doing damage. When a sanity check triggers, a warning
    message and a stack trace are printed.

    The list of operations can be extended if the need arises. For now it's
    limited to the requirements of the first user (timers).

    The core code enqueues the objects into hash buckets. The hash index is
    generated from the address of the object to simplify the lookup for the
    check on kfree/vfree. Each bucket has its own spinlock to avoid contention
    on a global lock.

    The debug code can be compiled in without being active. The runtime overhead
    is minimal and could be optimized by asm alternatives. A kernel command line
    option enables the debugging code.

    Thanks to Ingo Molnar for review, suggestions and cleanup patches.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
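
    The bucketing scheme described above, hashing the object's address so
    the kfree/vfree-time lookup needs no global lock, can be sketched as
    follows. The shift and table size are illustrative assumptions, not the
    kernel's actual choices:

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical bucket count; each bucket would carry its own
     * spinlock in the real implementation. */
    #define ODEBUG_HASH_BITS 4
    #define ODEBUG_HASH_SIZE (1u << ODEBUG_HASH_BITS)

    static unsigned int obj_hash(const void *addr)
    {
        /* Discard low address bits, which are mostly identical for
         * nearby objects, then mask down to a bucket index. */
        uintptr_t p = (uintptr_t)addr >> 6;
        return (unsigned int)(p & (ODEBUG_HASH_SIZE - 1));
    }

    int main(void)
    {
        /* Fixed example addresses so the mapping is reproducible. */
        assert(obj_hash((void *)0x1000) == 0);
        assert(obj_hash((void *)0x1040) == 1);
        /* The same address always lands in the same bucket, which is
         * what makes the free-time lookup possible. */
        assert(obj_hash((void *)0x1000) == obj_hash((void *)0x1000));
        puts("ok");
        return 0;
    }
    ```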
     

28 Apr, 2008

4 commits

  • Not all architectures define cache_line_size(), so, as suggested by Andrew,
    move the private implementations in mm/slab.c and mm/slob.c to .

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Reviewed-by: Christoph Lameter
    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     
  • Filtering zonelists requires very frequent use of zone_idx(). This is costly
    as it involves a lookup of another structure and a subtraction operation. As
    the zone_idx is often required, it should be quickly accessible. The node
    idx could also be stored here if accessing zone->node were found to be
    significant, which may be the case on workloads where nodemasks are heavily
    used.

    This patch introduces a struct zoneref to store a zone pointer and a zone
    index. The zonelist then consists of an array of these struct zonerefs which
    are looked up as necessary. Helpers are given for accessing the zone index as
    well as the node index.

    [kamezawa.hiroyu@jp.fujitsu.com: Suggested struct zoneref instead of embedding information in pointers]
    [hugh@veritas.com: mm-have-zonelist: fix memcg ooms]
    [hugh@veritas.com: just return do_try_to_free_pages]
    [hugh@veritas.com: do_try_to_free_pages gfp_mask redundant]
    Signed-off-by: Mel Gorman
    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Nick Piggin
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
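
    The idea is to cache the zone index alongside the zone pointer so the
    index never has to be recomputed. A userspace sketch; the struct zone
    body is a stand-in and the helper names follow the commit's description
    rather than being copied from the patch:

    ```c
    #include <assert.h>
    #include <stdio.h>

    struct zone {
        int nid;    /* node this zone belongs to (stand-in fields) */
    };

    /* A zoneref pairs the zone pointer with its precomputed index, so
     * reading the index is one load instead of a second-structure
     * lookup plus a subtraction. */
    struct zoneref {
        struct zone *zone;
        int zone_idx;
    };

    static int zonelist_zone_idx(const struct zoneref *z)
    {
        return z->zone_idx;
    }

    static int zonelist_node_idx(const struct zoneref *z)
    {
        return z->zone->nid;
    }

    int main(void)
    {
        struct zone dma = { .nid = 0 };
        struct zone normal = { .nid = 0 };
        /* A zonelist becomes an array of zonerefs in fallback order. */
        struct zoneref zl[] = { { &normal, 1 }, { &dma, 0 } };

        assert(zonelist_zone_idx(&zl[0]) == 1);
        assert(zonelist_node_idx(&zl[1]) == 0);
        puts("ok");
        return 0;
    }
    ```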
     
  • Currently a node has two sets of zonelists, one for each zone type in the
    system and a second set for GFP_THISNODE allocations. Based on the zones
    allowed by a gfp mask, one of these zonelists is selected. All of these
    zonelists consume memory and occupy cache lines.

    This patch replaces the multiple zonelists per-node with two zonelists. The
    first contains all populated zones in the system, ordered by distance, for
    fallback allocations when the target/preferred node has no free pages. The
    second contains all populated zones in the node suitable for GFP_THISNODE
    allocations.

    An iterator macro called for_each_zone_zonelist() is introduced that
    iterates through each zone allowed by the GFP flags in the selected
    zonelist.

    Signed-off-by: Mel Gorman
    Acked-by: Christoph Lameter
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
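
    An iterator in the spirit of for_each_zone_zonelist() can be sketched as
    a walk over a NULL-terminated zoneref array that skips zones above the
    highest index the GFP flags allow. This is a simplified userspace model,
    not the kernel macro:

    ```c
    #include <assert.h>
    #include <stdio.h>

    struct zone { const char *name; };
    struct zoneref { struct zone *zone; int zone_idx; };

    /* Walk the array; skip entries whose zone index exceeds highidx. */
    #define for_each_zone_zonelist(zone, z, zlist, highidx)          \
        for ((z) = (zlist); (z)->zone; (z)++)                        \
            if ((z)->zone_idx > (highidx)) continue;                 \
            else if (((zone) = (z)->zone), 1)

    int main(void)
    {
        struct zone dma = { "DMA" }, normal = { "Normal" }, high = { "HighMem" };
        /* Fallback order: highest zone first, terminated by a NULL zone. */
        struct zoneref zlist[] = {
            { &high, 2 }, { &normal, 1 }, { &dma, 0 }, { NULL, 0 }
        };
        struct zoneref *z;
        struct zone *zone;
        int seen = 0;

        for_each_zone_zonelist(zone, z, zlist, 1) {  /* allow idx <= 1 */
            printf("%s\n", zone->name);
            seen++;
        }
        assert(seen == 2);  /* HighMem (idx 2) was filtered out */
        return 0;
    }
    ```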
     
  • Introduce a node_zonelist() helper function. It is used to lookup the
    appropriate zonelist given a node and a GFP mask. The patch on its own is a
    cleanup but it helps clarify parts of the two-zonelist-per-node patchset. If
    necessary, it can be merged with the next patch in this set without problems.

    Reviewed-by: Christoph Lameter
    Signed-off-by: Mel Gorman
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
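
    With the two-zonelist layout from the previous commit, node_zonelist()
    reduces to selecting one of the node's two lists from the GFP mask. A
    hedged sketch; the real helper's signature and the flag value differ,
    and the structures here are stand-ins:

    ```c
    #include <assert.h>
    #include <stdio.h>

    #define __GFP_THISNODE 0x01u   /* illustrative value only */

    struct zonelist { const char *label; };

    struct pglist_data {
        /* [0] = full fallback list, [1] = GFP_THISNODE list */
        struct zonelist node_zonelists[2];
    };

    static struct zonelist *node_zonelist(struct pglist_data *pgdat,
                                          unsigned int gfp_mask)
    {
        return &pgdat->node_zonelists[(gfp_mask & __GFP_THISNODE) ? 1 : 0];
    }

    int main(void)
    {
        struct pglist_data node0 = { { { "fallback" }, { "thisnode" } } };

        assert(node_zonelist(&node0, 0) == &node0.node_zonelists[0]);
        assert(node_zonelist(&node0, __GFP_THISNODE) ==
               &node0.node_zonelists[1]);
        puts("ok");
        return 0;
    }
    ```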
     

20 Apr, 2008

1 commit

  • * Use new node_to_cpumask_ptr. This creates a pointer to the
    cpumask for a given node. This definition is in mm patch:

    asm-generic-add-node_to_cpumask_ptr-macro.patch

    * Use new set_cpus_allowed_ptr function.

    Depends on:
    [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
    [sched-devel]: sched: add new set_cpus_allowed_ptr function
    [x86/latest]: x86: add cpus_scnprintf function

    Cc: Greg Kroah-Hartman
    Cc: Greg Banks
    Cc: H. Peter Anvin
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

27 Mar, 2008

1 commit

  • Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
    memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
    but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
    patch fixes list_add() corruption in mm/slab.c seen on the ES7000.

    Cc: Mel Gorman
    Cc: Olaf Hering
    Cc: Christoph Lameter
    Signed-off-by: Dan Yeisley
    Signed-off-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Daniel Yeisley
     

20 Mar, 2008

1 commit

  • Fix various kernel-doc notation in mm/:

    filemap.c: add function short description; convert 2 to kernel-doc
    fremap.c: change parameter 'prot' to @prot
    pagewalk.c: change "-" in function parameters to ":"
    slab.c: fix short description of kmem_ptr_validate()
    swap.c: fix description & parameters of put_pages_list()
    swap_state.c: fix function parameters
    vmalloc.c: change "@returns" to "Returns:" since that is not a parameter

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

07 Mar, 2008

3 commits

  • NUMA slab allocator cpu migration bugfix

    The NUMA slab allocator (specifically, cache_alloc_refill)
    is not refreshing its local copies of what cpu and what
    numa node it is on, when it drops and reacquires the irq
    block that it inherited from its caller. As a result
    those values become invalid if an attempt to migrate the
    process to another numa node occurred while the irq block
    had been dropped.

    The solution is to make cache_alloc_refill reload these
    variables whenever it drops and reacquires the irq block.

    The error is very difficult to hit. When it does occur,
    one gets the following oops + stack traceback bits in
    check_spinlock_acquired:

    kernel BUG at mm/slab.c:2417
    cache_alloc_refill+0xe6
    kmem_cache_alloc+0xd0
    ...

    This patch was developed against 2.6.23, ported to and
    compiled-tested only against 2.6.25-rc4.

    Signed-off-by: Joe Korty
    Signed-off-by: Christoph Lameter

    Joe Korty
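
    The underlying pattern, a value cached before a lock is dropped going
    stale once the task can migrate, is worth spelling out. A toy userspace
    model of the bug and the fix, with a plain variable standing in for
    numa_node_id() and the irq block:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Stand-in for the current node; in the kernel this changes when
     * the scheduler migrates the task to another numa node. */
    static int current_node = 0;

    static int numa_node_id(void)
    {
        return current_node;
    }

    int main(void)
    {
        int node = numa_node_id();   /* cached while "irqs disabled" */

        /* ... irq block dropped here; pretend we were migrated ... */
        current_node = 1;

        /* Buggy code keeps using the stale cached value: */
        assert(node == 0);

        /* The fix: reload after reacquiring the irq block. */
        node = numa_node_id();
        assert(node == 1);
        puts("ok");
        return 0;
    }
    ```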
     
  • Make them all use angle brackets and the directory name.

    Acked-by: Pekka Enberg
    Signed-off-by: Joe Perches
    Signed-off-by: Christoph Lameter

    Joe Perches
     
  • The NUMA fallback logic should be passing local_flags to kmem_getpages(),
    not simply the flags passed in.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Christoph Lameter
     

15 Feb, 2008

1 commit


26 Jan, 2008

2 commits

  • This patch converts the known per-subsystem mutexes to
    get_online_cpus/put_online_cpus. It also eliminates the CPU_LOCK_ACQUIRE
    and CPU_LOCK_RELEASE hotplug notification events.

    Signed-off-by: Gautham R Shenoy
    Signed-off-by: Ingo Molnar

    Gautham R Shenoy
     
  • If the node we're booting on doesn't have memory, bootstrapping kmalloc()
    caches resorts to fallback_alloc() which requires ->nodelists set for all
    nodes. Fix that by calling set_up_list3s() for CACHE_CACHE in
    kmem_cache_init().

    As kmem_getpages() is called with GFP_THISNODE set, this used to work before
    because of breakage in 2.6.22 and before with GFP_THISNODE returning pages from
    the wrong node if a node had no memory. So it may have worked accidentally,
    and in an unsafe manner, because the pages would have been associated with
    the wrong node, which could trigger BUG_ONs and locking trouble.

    Tested-by: Mel Gorman
    Tested-by: Olaf Hering
    Reviewed-by: Christoph Lameter
    Signed-off-by: Pekka Enberg
    [ With additional one-liner by Olaf Hering - Linus ]
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     

25 Jan, 2008

1 commit

  • Partially revert the changes made by 04231b3002ac53f8a64a7bd142fde3fa4b6808c6
    to the kmem_list3 management. On a machine with a memoryless node, this
    BUG_ON was triggering:

    static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
    {
            struct list_head *entry;
            struct slab *slabp;
            struct kmem_list3 *l3;
            void *obj;
            int x;

            l3 = cachep->nodelists[nodeid];
            BUG_ON(!l3);

    Signed-off-by: Mel Gorman
    Cc: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: "Aneesh Kumar K.V"
    Cc: Nishanth Aravamudan
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

03 Jan, 2008

1 commit

  • Both SLUB and SLAB really did almost exactly the same thing for
    /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's.

    This just creates a common CONFIG_SLABINFO that is enabled by both SLUB
    and SLAB, and shares all the setup code. Maybe SLOB will want this some
    day too.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 Dec, 2007

1 commit

  • mm/slub.c exports ksize(), but mm/slob.c and mm/slab.c don't.

    It's used by binfmt_flat, which can be built as a module.

    Signed-off-by: Tetsuo Handa
    Cc: Christoph Lameter
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

01 Dec, 2007

1 commit

  • The database performance group have found that half the cycles spent
    in kmem_cache_free are spent in this one call to BUG_ON. Moving it
    into the CONFIG_SLAB_DEBUG-only function cache_free_debugcheck() is a
    performance win of almost 0.5% on their particular benchmark.

    The call was added as part of commit ddc2e812d592457747c4367fb73edcaa8e1e49ff
    with the comment that "overhead should be minimal". It may have been
    minimal at the time, but it isn't now.

    [ Quoth Pekka Enberg: "I don't think the BUG_ON per se caused the
    performance regression but rather the virt_to_head_page() changes to
    virt_to_cache() that were added later." ]

    Signed-off-by: Matthew Wilcox
    Acked-by: Pekka J Enberg
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
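
    The technique is to gate a hot-path check behind the debug config so it
    compiles away entirely in production builds. A sketch of that pattern;
    the function and macro names are illustrative, not copied from mm/slab.c:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Uncomment to re-enable the check, as CONFIG_SLAB_DEBUG would: */
    /* #define CONFIG_SLAB_DEBUG */

    #ifdef CONFIG_SLAB_DEBUG
    #define debug_check(cond) assert(cond)
    #else
    #define debug_check(cond) ((void)0)   /* zero cost when debug is off */
    #endif

    static void cache_free(void *obj, void *expected_cache, void *actual_cache)
    {
        /* Formerly an unconditional BUG_ON on the hot path. */
        debug_check(expected_cache == actual_cache);
        (void)obj; (void)expected_cache; (void)actual_cache;
        /* ... actual free ... */
    }

    int main(void)
    {
        int obj, cache_a, cache_b;

        /* With CONFIG_SLAB_DEBUG undefined, even a mismatched cache
         * is not checked; production free paths pay nothing. */
        cache_free(&obj, &cache_a, &cache_b);
        puts("ok");
        return 0;
    }
    ```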
     

15 Nov, 2007

1 commit


20 Oct, 2007

1 commit


19 Oct, 2007

2 commits

  • This patch fixes a memory leak in the error path.

    In reality, we don't need to call cpuup_canceled(cpu) for now, but an
    upcoming cpu hotplug error handling change will need it.

    Cc: Christoph Lameter
    Cc: Gautham R Shenoy
    Acked-by: Pekka Enberg
    Signed-off-by: Akinobu Mita
    Cc: Gautham R Shenoy
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • cpuup_callback() is too long. This patch factors out the CPU_UP_CANCELED
    and CPU_UP_PREPARE handling from cpuup_callback().

    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Akinobu Mita
    Cc: Gautham R Shenoy
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

17 Oct, 2007

1 commit