27 Jul, 2008

1 commit

  • This patch makes the following needlessly global functions static (a
    before/after sketch follows this entry):
    - percpu_depopulate()
    - __percpu_depopulate_mask()
    - percpu_populate()
    - __percpu_populate_mask()

    Signed-off-by: Adrian Bunk
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
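
    A small before/after sketch of what this kind of change looks like. The
    signature is recalled from mm/allocpercpu.c of that era and may not match
    the tree exactly; the body is elided.

    /* Before: declared in a header, visible to the whole kernel even though
     * only mm/allocpercpu.c ever calls it. */
    void percpu_depopulate(void *__pdata, int cpu);

    /* After: the header declaration is dropped and the definition gains the
     * static qualifier, giving the symbol internal linkage so the compiler
     * can also warn if it ever becomes unused. */
    static void percpu_depopulate(void *__pdata, int cpu)
    {
            /* ... free this cpu's copy of the object ... */
    }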
     

06 Jul, 2008

1 commit


05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

24 May, 2008

1 commit


20 Apr, 2008

1 commit

  • * Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE,
    NODE_MASK_ALL to reduce stack requirements for large NR_CPUS
    and MAXNODES counts.

    * In some cases, the cpumask variable was initialized but then
    unconditionally overwritten with another value, making the initializer
    pure overhead. This is the case for changes like this (see also the
    sketch after this entry):

    - cpumask_t oldmask = CPU_MASK_ALL;
    + cpumask_t oldmask;

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
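
    For context, a rough sketch of why the on-stack masks matter; the
    4096-CPU figure and the later assignment are illustrative assumptions,
    not taken from the patch itself.

    /* cpumask_t is essentially a fixed-size bitmap of NR_CPUS bits: */
    typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

    /* With NR_CPUS = 4096 that is 4096/8 = 512 bytes, so an initializer like
     *
     *     cpumask_t oldmask = CPU_MASK_ALL;
     *
     * creates and fully writes a 512-byte object on the stack, which is pure
     * waste when oldmask is unconditionally overwritten a few lines later,
     * e.g. (hypothetical assignment):
     *
     *     oldmask = current->cpus_allowed;
     */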
     

05 Mar, 2008

1 commit

  • Some oprofile results obtained while using tbench on a 2x2 cpu machine were
    very surprising.

    For example, the loopback_xmit() function was using a high number of cpu
    cycles to perform its statistics updates, which are supposed to be really
    cheap since they use percpu data:

    pcpu_lstats = netdev_priv(dev);
    lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
    lb_stats->packets++; /* HERE : serious contention */
    lb_stats->bytes += skb->len;

    struct pcpu_lstats is a small structure containing two longs. It appears
    that on my 32-bit platform, alloc_percpu(8) allocates a single cache line
    instead of giving each cpu a separate cache line.

    Using the following patch gave me an impressive boost in various benchmarks
    (6% in tbench); all percpu_counters hit this bug too.

    The long-term fix (i.e. >= 2.6.26) would be to let each CPU allocate its
    own block of memory, so that we don't need to round up sizes to
    L1_CACHE_BYTES, or to merge the SGI stuff, of course...

    Note: SLUB vs SLAB is important here to *show* the improvement, since they
    don't have the same minimum allocation sizes (8 bytes vs 32 bytes). This
    could very well explain regressions some people reported when they switched
    to SLUB. (A sketch of the cache-line rounding idea follows this entry.)

    Signed-off-by: Eric Dumazet
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
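
    A minimal sketch of the interim fix described above, assuming the roundup
    approach; the exact call site in mm/allocpercpu.c may differ in detail.

    /* When populating one cpu's copy of a dynamically allocated percpu
     * object, round the request up to a full cache line so that two cpus
     * never share (and bounce) the same line on every counter update: */
    size = roundup(size, cache_line_size());
    pdata->ptrs[cpu] = kmalloc_node(size, gfp | __GFP_ZERO, cpu_to_node(cpu));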
     

07 Feb, 2008

1 commit


18 Jul, 2007

1 commit

  • kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
    variant in the past. But with __GFP_ZERO it is now possible to zero the
    memory while allocating.

    Use __GFP_ZERO to remove the explicit clearing of memory via memset wherever
    we can (a before/after sketch follows this entry).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
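
    A minimal before/after sketch of the conversion; the variable names are
    illustrative rather than taken from a specific call site.

    /* Before: allocate on the right node, then clear the memory by hand. */
    ptr = kmalloc_node(size, GFP_KERNEL, node);
    if (ptr)
            memset(ptr, 0, size);

    /* After: ask the allocator for zeroed memory in the first place. */
    ptr = kmalloc_node(size, GFP_KERNEL | __GFP_ZERO, node);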
     

08 Dec, 2006

1 commit

  • The patch (as824b) makes percpu_free() ignore NULL arguments, as one would
    expect for a deallocation routine. (Note that free_percpu is #defined as
    percpu_free in include/linux/percpu.h.) A few callers are updated to remove
    now-unneeded tests for NULL. A few other callers already seem to assume
    that passing a NULL pointer to percpu_free() is okay!

    The patch also removes an unnecessary NULL check in percpu_depopulate().
    (A short caller-side illustration follows this entry.)

    Signed-off-by: Alan Stern
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
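
    A short caller-side illustration of the change; the structure and field
    names are hypothetical.

    /* Before as824b, careful callers guarded against NULL themselves: */
    if (priv->lstats)
            free_percpu(priv->lstats);

    /* After it, percpu_free()/free_percpu() simply ignores NULL, matching
     * kfree(), so the guard can be dropped: */
    free_percpu(priv->lstats);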
     

26 Sep, 2006

1 commit