24 Jun, 2009

1 commit

  • This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use the
    dynamic percpu allocator. The first chunk is allocated using the
    embedding helper and 8k is reserved for modules. This ensures that
    the new allocator behaves almost identically to the original allocator
    as far as static percpu variables are concerned, so it shouldn't
    introduce much breakage.

    s390 and alpha use a custom SHIFT_PERCPU_PTR() to work around the
    addressing range limit their addressing models impose. Unfortunately,
    this breaks if the address is specified using a variable, so for now
    these two archs aren't converted.
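
    As a rough illustration (a stand-alone user-space sketch, not kernel
    code; every name and size below is made up), resolving a percpu
    pointer boils down to "base address plus this cpu's offset", which is
    what the generic SHIFT_PERCPU_PTR() effectively does:

    #include <stdio.h>

    #define NR_CPUS   4
    #define UNIT_SIZE 4096                  /* size of one cpu's unit (illustrative) */

    static char percpu_area[NR_CPUS * UNIT_SIZE];   /* the "first chunk" */
    static long per_cpu_offset[NR_CPUS];            /* offset of each cpu's unit */

    /* analogous to SHIFT_PERCPU_PTR(): shift a base pointer by a cpu's offset */
    static void *shift_percpu_ptr(void *base, long off)
    {
        return (char *)base + off;
    }

    int main(void)
    {
        int *counter = (int *)percpu_area;          /* cpu 0's copy of a counter */
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
            per_cpu_offset[cpu] = (long)cpu * UNIT_SIZE;

        for (cpu = 0; cpu < NR_CPUS; cpu++) {
            int *p = shift_percpu_ptr(counter, per_cpu_offset[cpu]);
            *p += 1;                                /* each cpu touches only its own copy */
            printf("cpu%d: %d\n", cpu, *p);
        }
        return 0;
    }

    The custom s390/alpha variants perform this same shift within their
    limited addressing range, which is what breaks once the address comes
    from a variable.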

    The following architectures are affected by this change.

    * sh
    * arm
    * cris
    * mips
    * sparc(32)
    * blackfin
    * avr32
    * parisc (broken, under investigation)
    * m32r
    * powerpc(32)

    As this change makes the dynamic allocator the default one,
    CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its inverse,
    CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to the yet-to-be-converted
    archs. These archs implement their own setup_per_cpu_areas() and their
    conversion is not trivial.

    * powerpc(64)
    * sparc(64)
    * ia64
    * alpha
    * s390

    Boot and batch alloc/free tests were run on x86_32 with debug code
    (x86_32 doesn't use the default first chunk initialization). Compile
    tested on sparc(32), powerpc(32), arm and alpha.

    Kyle McMartin reported that this change breaks parisc. The problem is
    still under investigation and he is okay with pushing this patch
    forward and fixing parisc later.

    [ Impact: use dynamic allocator for most archs w/o custom percpu setup ]

    Signed-off-by: Tejun Heo
    Acked-by: Rusty Russell
    Acked-by: David S. Miller
    Acked-by: Benjamin Herrenschmidt
    Acked-by: Martin Schwidefsky
    Reviewed-by: Christoph Lameter
    Cc: Paul Mundt
    Cc: Russell King
    Cc: Mikael Starvik
    Cc: Ralf Baechle
    Cc: Bryan Wu
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Grant Grundler
    Cc: Hirokazu Takata
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Heiko Carstens
    Cc: Ingo Molnar

    Tejun Heo
     

07 Apr, 2009

1 commit


30 Mar, 2009

1 commit


11 Mar, 2009

1 commit

  • Impact: remove spurious WARN on legacy SMP percpu allocator

    Commit f2a8205c4ef1af917d175c36a4097ae5587791c8 added an overly tight
    WARN_ON_ONCE() on alignments for the UP and legacy SMP percpu
    allocators. Commit e317603694bfd17b28a40de9d65e1a4ec12f816e fixed it
    for UP, but the legacy SMP allocator was forgotten. Fix it.

    Signed-off-by: Tejun Heo
    Reported-by: Sachin P. Sant

    Tejun Heo
     

20 Feb, 2009

1 commit

  • Impact: kill unused functions

    percpu_alloc() and its friends never saw much action. It was supposed
    to replace the cpu-mask-unaware __alloc_percpu(), but that never
    happened; in fact, __percpu_alloc_mask() itself never really grew a
    proper up/down handling interface either (no exported interface for
    populate/depopulate).

    percpu allocation is about to go through a major reimplementation and
    there's no reason to carry this unused interface around. Replace it
    with __alloc_percpu() and free_percpu().
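
    For reference, a hedged sketch of how callers use the surviving
    interface (the allocator calls and iterators are the real ones; the
    example structure and function are purely illustrative):

    #include <linux/percpu.h>
    #include <linux/kernel.h>
    #include <linux/errno.h>

    struct my_stats {                  /* hypothetical example structure */
        unsigned long packets;
        unsigned long bytes;
    };

    static int my_stats_sum(void)
    {
        struct my_stats *stats;
        unsigned long total = 0;
        int cpu;

        /* alloc_percpu() wraps __alloc_percpu(size, align) */
        stats = alloc_percpu(struct my_stats);
        if (!stats)
            return -ENOMEM;

        /* each cpu's copy is reached through per_cpu_ptr() */
        for_each_possible_cpu(cpu)
            total += per_cpu_ptr(stats, cpu)->packets;

        pr_info("total packets: %lu\n", total);
        free_percpu(stats);
        return 0;
    }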

    Signed-off-by: Tejun Heo

    Tejun Heo
     

27 Jul, 2008

1 commit

  • This patch makes the following needlessly global functions static:
    - percpu_depopulate()
    - __percpu_depopulate_mask()
    - percpu_populate()
    - __percpu_populate_mask()

    Signed-off-by: Adrian Bunk
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

06 Jul, 2008

1 commit


05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

24 May, 2008

1 commit


20 Apr, 2008

1 commit

  • * Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE,
    NODE_MASK_ALL to reduce stack requirements for large NR_CPUS
    and MAXNODES counts.

    * In some cases, the cpumask variable was initialized but then overwritten
    with another value. This is the case for changes like this:

    - cpumask_t oldmask = CPU_MASK_ALL;
    + cpumask_t oldmask;
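
    The stack saving is easy to quantify: a cpumask_t is one bit per
    possible cpu, so every on-stack cpumask costs NR_CPUS/8 bytes. A
    stand-alone sketch (the NR_CPUS value is chosen only for illustration):

    #include <stdio.h>

    #define NR_CPUS 4096

    /* same layout idea as the kernel's cpumask_t: a fixed-size bitmap */
    typedef struct {
        unsigned long bits[NR_CPUS / (8 * sizeof(unsigned long))];
    } cpumask_t;

    int main(void)
    {
        /* 4096 bits -> 512 bytes for every such local variable */
        printf("sizeof(cpumask_t) = %zu bytes\n", sizeof(cpumask_t));
        return 0;
    }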

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

05 Mar, 2008

1 commit

  • Some oprofile results obtained while using tbench on a 2x2 cpu machine were
    very surprising.

    For example, the loopback_xmit() function was using a high number of
    cpu cycles to perform its statistics updates, which are supposed to be
    really cheap since they use percpu data:

    pcpu_lstats = netdev_priv(dev);
    lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
    lb_stats->packets++; /* HERE : serious contention */
    lb_stats->bytes += skb->len;

    struct pcpu_lstats is a small structure containing two longs. It appears
    that on my 32-bit platform, alloc_percpu(8) allocates a single cache line
    instead of giving each cpu a separate cache line.

    Using the following patch gave me an impressive boost in various
    benchmarks (6% in tbench); all percpu_counters hit this bug too.

    The long-term fix (ie >= 2.6.26) would be to let each CPU allocate its
    own block of memory, so that we don't need to round up sizes to
    L1_CACHE_BYTES, or to merge the SGI stuff of course...

    Note: SLUB vs SLAB is important here to *show* the improvement, since they
    don't have the same minimum allocation sizes (8 bytes vs 32 bytes). This
    could very well explain the regressions some people reported when they
    switched to SLUB.
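
    As the message implies, the stop-gap is to round each cpu's allocation
    up to L1_CACHE_BYTES so that two cpus never share a cache line. A
    stand-alone sketch of that rounding (the 64-byte line size is assumed
    for illustration; the kernel takes it from the architecture):

    #include <stdio.h>

    #define L1_CACHE_BYTES 64          /* assumed line size, illustration only */

    /* round a size up to the next multiple of the cache line size */
    static size_t l1_cache_align(size_t size)
    {
        return (size + L1_CACHE_BYTES - 1) & ~((size_t)L1_CACHE_BYTES - 1);
    }

    int main(void)
    {
        /* struct pcpu_lstats is two longs: 8 bytes on 32-bit, 16 on 64-bit */
        printf("8 bytes  -> %zu\n", l1_cache_align(8));
        printf("16 bytes -> %zu\n", l1_cache_align(16));
        return 0;
    }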

    Signed-off-by: Eric Dumazet
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

07 Feb, 2008

1 commit


18 Jul, 2007

1 commit

  • kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
    variant in the past. But with __GFP_ZERO it is now possible to do the
    zeroing while allocating.

    Use __GFP_ZERO to remove the explicit clearing of memory via memset
    wherever we can.
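
    The conversion is mechanical; an illustrative hunk (not taken from
    this patch) looks like:

    -       ptr = kmalloc_node(size, GFP_KERNEL, node);
    -       if (ptr)
    -               memset(ptr, 0, size);
    +       ptr = kmalloc_node(size, GFP_KERNEL | __GFP_ZERO, node);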

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

08 Dec, 2006

1 commit

  • The patch (as824b) makes percpu_free() ignore NULL arguments, as one would
    expect for a deallocation routine. (Note that free_percpu is #defined as
    percpu_free in include/linux/percpu.h.) A few callers are updated to remove
    now-unneeded tests for NULL. A few other callers already seem to assume
    that passing a NULL pointer to percpu_free() is okay!

    The patch also removes an unnecessary NULL check in percpu_depopulate().
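
    This mirrors the long-standing kfree(NULL) convention; an illustrative
    caller cleanup (not from the patch) looks like:

    -       if (priv->stats)
    -               percpu_free(priv->stats);
    +       percpu_free(priv->stats);       /* NULL is simply ignored */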

    Signed-off-by: Alan Stern
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     

26 Sep, 2006

1 commit