22 Apr, 2009

2 commits

  • Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so
    that they're in one place, and give them descriptive comments, particularly
    the SHARED_ALIGNED variant.

    It would be nice to collect these in linux/percpu.h, but that's not possible
    without sorting out the severe #include recursion between the x86 arch headers
    and the general headers (and possibly other arches too).

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
    does not agree with that specified by DEFINE_PER_CPU(). This means that
    architectures that reference a small data section relative to a base
    register may throw up linkage errors because the displacement between
    where the base register points and the per-CPU variable is too great.

    On FRV, the .h declaration says that the variable is in the .sdata section, but
    the .c definition says it's actually in the .data section. The linker throws
    up the following errors:

    kernel/built-in.o: In function `release_task':
    kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
    kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o

    To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
    as does DEFINE_PER_CPU(). However, this is made slightly more complex by
    virtue of the fact that there are several variants on DEFINE, so these need to
    be matched by variants on DECLARE.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
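
The fix described above amounts to having the declaration and the definition carry the same section attribute. A minimal user-space sketch of that idea follows, assuming GCC or Clang on an ELF target. The DEMO_ macro names and the section name are hypothetical stand-ins, not the kernel's actual definitions.

```c
/* Hypothetical section name; the kernel uses its own per-CPU sections. */
#define DEMO_PER_CPU_SECTION ".data.demo_percpu"

/*
 * Both macros apply the same section attribute, so a declaration in a
 * header cannot disagree with the definition in a .c file.
 */
#define DEMO_DECLARE_PER_CPU(type, name)				\
	extern __attribute__((section(DEMO_PER_CPU_SECTION)))		\
	__typeof__(type) demo_per_cpu__##name

#define DEMO_DEFINE_PER_CPU(type, name)					\
	__attribute__((section(DEMO_PER_CPU_SECTION)))			\
	__typeof__(type) demo_per_cpu__##name

/* What a header would carry. */
DEMO_DECLARE_PER_CPU(unsigned long, process_counts);

/* What the .c file would carry. */
DEMO_DEFINE_PER_CPU(unsigned long, process_counts) = 0;

/* Any user of the variable sees one consistent section placement. */
unsigned long demo_bump_process_counts(void)
{
	return ++demo_per_cpu__process_counts;
}
```

Each DEFINE variant would need a DECLARE twin built the same way, which is the extra complexity the commit message mentions.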
     

11 Apr, 2009

1 commit

  • For the time being, move the generic percpu_*() accessors to
    linux/percpu.h.

    asm-generic/percpu.h is meant to carry the generic low-level pieces
    (declarations, definitions, pointer offset calculation and so on),
    not the generic interface.

    Signed-off-by: Ingo Molnar

    Tejun Heo
     

16 Jan, 2009

1 commit

  • It is an optimization and a cleanup, and adds the following new
    generic percpu methods:

    percpu_read()
    percpu_write()
    percpu_add()
    percpu_sub()
    percpu_and()
    percpu_or()
    percpu_xor()

    and implements support for them on x86. (other architectures will fall
    back to a default implementation)

    The advantage is that for example to read a local percpu variable,
    instead of this sequence:

    return __get_cpu_var(var);

    ffffffff8102ca2b: 48 8b 14 fd 80 09 74 81   mov -0x7e8bf680(,%rdi,8),%rdx
    ffffffff8102ca33: 48 c7 c0 d8 59 00 00      mov $0x59d8,%rax
    ffffffff8102ca3a: 48 8b 04 10               mov (%rax,%rdx,1),%rax

    We can get a single instruction by using the optimized variants:

    return percpu_read(var);

    ffffffff8102ca3f: 65 48 8b 05 91 8f fd mov %gs:0x7efd8f91(%rip),%rax

    I also cleaned up the x86-specific APIs and made the x86 code use
    these new generic percpu primitives.

    tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
        * added percpu_and() for completeness's sake
        * made generic percpu ops atomic against preemption

    Signed-off-by: Ingo Molnar
    Signed-off-by: Tejun Heo

    Ingo Molnar
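
The default (non-x86) implementation mentioned above can be pictured as a get/modify/put sequence with preemption held off. The following is a user-space model only, assuming GCC or Clang (it relies on statement expressions). The per-CPU storage is simulated with an array, the preemption counter is a plain integer, and every demo_ name is hypothetical.

```c
#define DEMO_NR_CPUS 4

int demo_current_cpu;		/* stand-in for the executing CPU's id */
int demo_preempt_count;		/* stand-in for the preemption counter */

/* Simulated per-CPU variable: one slot per CPU. */
#define DEMO_DEFINE_PER_CPU(type, name) \
	type demo_per_cpu__##name[DEMO_NR_CPUS]

/* "Disable preemption" and yield this CPU's slot as an lvalue. */
#define demo_get_cpu_var(name) \
	(*({ demo_preempt_count++; &demo_per_cpu__##name[demo_current_cpu]; }))

/* "Re-enable preemption". */
#define demo_put_cpu_var(name) ((void)demo_preempt_count--)

/* Read: copy the value out while pinned, then unpin. */
#define demo_percpu_read(name)						\
	({								\
		__typeof__(demo_per_cpu__##name[0]) tmp__ =		\
			demo_get_cpu_var(name);				\
		demo_put_cpu_var(name);					\
		tmp__;							\
	})

/* Shared helper for the read-modify-write operations. */
#define demo_percpu_to_op(name, val, op)				\
	do {								\
		demo_get_cpu_var(name) op (val);			\
		demo_put_cpu_var(name);					\
	} while (0)

#define demo_percpu_write(name, val)	demo_percpu_to_op(name, val, =)
#define demo_percpu_add(name, val)	demo_percpu_to_op(name, val, +=)
#define demo_percpu_sub(name, val)	demo_percpu_to_op(name, val, -=)
#define demo_percpu_and(name, val)	demo_percpu_to_op(name, val, &=)
#define demo_percpu_or(name, val)	demo_percpu_to_op(name, val, |=)
#define demo_percpu_xor(name, val)	demo_percpu_to_op(name, val, ^=)

DEMO_DEFINE_PER_CPU(long, hits);
```

An architecture with a segment-relative addressing mode, as on x86, can replace each of these with a single instruction, which is the optimization the commit message shows.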
     

24 Feb, 2008

1 commit

  • 2.6.25-rc1 percpu changes broke CONFIG_DEBUG_PREEMPT's per_cpu checking
    on several architectures. On s390, sparc64 and x86 it's been weakened to
    not checking at all; whereas on powerpc64 it's become too strict, issuing
    warnings from __raw_get_cpu_var in io_schedule and init_timer for example.

    Fix this by weakening powerpc's __my_cpu_offset to use the non-checking
    local_paca instead of get_paca (which itself contains such a check);
    and strengthening the generic my_cpu_offset to go the old slow way via
    smp_processor_id when CONFIG_DEBUG_PREEMPT is set (debug_smp_processor_id
    is where all the knowledge of what is correct, and when, lives).

    Signed-off-by: Hugh Dickins
    Reviewed-by: Mike Travis
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

30 Jan, 2008

4 commits

  • Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Mike Travis
     
  • - add support for PER_CPU_ATTRIBUTES

    - fix generic smp percpu_modcopy to use per_cpu_offset() macro.

    Add the ability to use generic/percpu even if the arch needs to override
    several aspects of its operations. This will enable the use of generic
    percpu.h for all arches.

    An arch may define:

    __per_cpu_offset        Do not use the generic pointer array. Arch must
                            define per_cpu_offset(cpu) (used by x86_64, s390).

    __my_cpu_offset         Can be defined to provide an optimized way to determine
                            the offset for variables of the currently executing
                            processor. Used by ia64, x86_64, x86_32, sparc64, s/390.

    SHIFT_PTR(ptr, offset)  If an arch defines it then special handling
                            of pointer arithmetic may be implemented. Used
                            by s/390.

    (Some of these special percpu arch implementations may be later consolidated
    so that there are fewer cases to deal with.)

    Cc: Rusty Russell
    Cc: Andi Kleen
    Signed-off-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    travis@sgi.com
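
The override points listed above follow a common header pattern: the generic code supplies a default only when the architecture has not already defined the hook. A user-space sketch of that pattern follows, assuming GCC or Clang. All demo_ and DEMO_ names are hypothetical, and the "per-CPU areas" are simply elements of one array so the pointer arithmetic stays inside a single object.

```c
#define DEMO_NR_CPUS 4

/* One simulated per-CPU area per CPU; element 0 doubles as the template. */
struct demo_area {
	long counter;
};
struct demo_area demo_areas[DEMO_NR_CPUS];

/* Generic pointer array, used unless the arch supplies its own lookup. */
unsigned long demo_offset_table[DEMO_NR_CPUS];
int demo_current_cpu;

/* An "arch header" could define any of these before this point. */
#ifndef demo_per_cpu_offset
#define demo_per_cpu_offset(cpu) (demo_offset_table[(cpu)])
#endif

#ifndef demo_my_cpu_offset
#define demo_my_cpu_offset demo_per_cpu_offset(demo_current_cpu)
#endif

#ifndef DEMO_SHIFT_PTR
#define DEMO_SHIFT_PTR(ptr, offset) \
	((__typeof__(ptr))((char *)(ptr) + (offset)))
#endif

/* Generic accessors written only in terms of the hooks above. */
#define demo_per_cpu_ptr(ptr, cpu) \
	DEMO_SHIFT_PTR((ptr), demo_per_cpu_offset(cpu))
#define demo_this_cpu_ptr(ptr) \
	DEMO_SHIFT_PTR((ptr), demo_my_cpu_offset)

/* Fill the generic table: CPU n's copy sits n areas past the template. */
void demo_setup_offsets(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_offset_table[cpu] =
			(unsigned long)cpu * sizeof(struct demo_area);
}
```

Because the accessors never touch the table directly, an architecture that keeps the current offset in a register would only need to redefine the one hook.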
     
  • - Special consideration for IA64: Add the ability to specify
    arch specific per cpu flags

    - remove .data.percpu attribute from DEFINE_PER_CPU for non-smp case.

    The arch definitions are all the same. So move them into linux/percpu.h.

    We cannot move DECLARE_PER_CPU since some include files just include
    asm/percpu.h to avoid include recursion problems.

    Cc: Rusty Russell
    Cc: Andi Kleen
    Signed-off-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    travis@sgi.com
     
  • The use of __GENERIC_PERCPU is a bit problematic since arches
    may want to run their own percpu setup while using the generic
    percpu definitions. Replace it with a kconfig variable.

    Cc: Rusty Russell
    Cc: Andi Kleen
    Signed-off-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    travis@sgi.com
     

20 Jul, 2007

1 commit

  • The per-cpu data section contains two types of data: one set which is
    exclusively accessed by the local cpu, and another set which is per cpu
    but also shared by remote cpus. In the current kernel, these two sets are
    not clearly separated. This can potentially cause the same data
    cacheline to be shared between the two sets of data, which will result in
    unnecessary bouncing of the cacheline between cpus.

    One way to fix the problem is to cacheline align the remotely accessed per
    cpu data, both at the beginning and at the end. Because of the padding at
    both ends, this will likely cause some memory wastage and also the
    interface to achieve this is not clean.

    This patch:

    Moves the remotely accessed per cpu data (which is currently marked
    as ____cacheline_aligned_in_smp) into a different section, where all the data
    elements are cacheline aligned. And as such, this differentiates the local
    only data and remotely accessed data cleanly.

    Signed-off-by: Fenghua Yu
    Acked-by: Suresh Siddha
    Cc: Rusty Russell
    Cc: Christoph Lameter
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fenghua Yu
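
The separation described above can be illustrated with two definition macros, one for local-only data and one that places remotely accessed data in its own cacheline-aligned section. This is a sketch under stated assumptions: GCC or Clang on an ELF target, a 64-byte cacheline, and hypothetical DEMO_ macro, section, and variable names.

```c
#include <stdint.h>

#define DEMO_CACHELINE 64	/* assumed cacheline size */

/* Local-only per-CPU data: packed densely, no alignment padding. */
#define DEMO_DEFINE_PER_CPU(type, name)					\
	__attribute__((section(".data.demo_percpu")))			\
	__typeof__(type) demo_per_cpu__##name

/*
 * Remotely accessed per-CPU data: a separate section in which every
 * element starts on a cacheline boundary, so it never shares a line
 * with local-only data.
 */
#define DEMO_DEFINE_PER_CPU_SHARED_ALIGNED(type, name)			\
	__attribute__((section(".data.demo_percpu.shared_aligned"),	\
		       aligned(DEMO_CACHELINE)))			\
	__typeof__(type) demo_per_cpu__##name

DEMO_DEFINE_PER_CPU(long, local_ticks) = 0;
DEMO_DEFINE_PER_CPU_SHARED_ALIGNED(long, remote_lock) = 0;
DEMO_DEFINE_PER_CPU_SHARED_ALIGNED(long, remote_queue_len) = 0;

/* Helper for checking placement at run time. */
int demo_is_cacheline_aligned(const void *p)
{
	return ((uintptr_t)p % DEMO_CACHELINE) == 0;
}
```

Since every element in the shared section begins on a boundary, the padding falls between shared elements only, and the local-only section pays nothing for it.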
     

03 May, 2007

1 commit

  • Allocating PDA and GDT at boot is a pain. Using simple per-cpu variables adds
    happiness (although we need the GDT page-aligned for Xen, which we do in a
    followup patch).

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Rusty Russell
     

06 Oct, 2006

1 commit


26 Sep, 2006

1 commit


04 Jul, 2006

1 commit


26 Jun, 2006

1 commit

  • There are several instances of per_cpu(foo, raw_smp_processor_id()), which
    is semantically equivalent to __get_cpu_var(foo) but without the warning
    that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
    those architectures with optimized per-cpu implementations, namely ia64,
    powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
    code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
    on those platforms.

    This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
    raw_smp_processor_id()) on architectures that use the generic per-cpu
    implementation, and turns into __get_cpu_var(x) on the architectures that
    have an optimized per-cpu implementation.

    Signed-off-by: Paul Mackerras
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Acked-by: Martin Schwidefsky
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mackerras
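
For the generic case, the relationship between the checking and non-checking accessors can be modelled as below. This is a user-space sketch with hypothetical demo_ names. The debug check is reduced to a counter that increments when the CPU id is read with preemption enabled.

```c
#define DEMO_NR_CPUS 4

int demo_current_cpu;		/* simulated executing CPU */
int demo_preempt_disabled;	/* non-zero when it is safe to ask "which CPU?" */
int demo_debug_warnings;	/* counts simulated debug warnings */

/* Non-checking lookup: never warns. */
int demo_raw_smp_processor_id(void)
{
	return demo_current_cpu;
}

/* Checking lookup: warns if the caller could migrate between CPUs. */
int demo_smp_processor_id(void)
{
	if (!demo_preempt_disabled)
		demo_debug_warnings++;
	return demo_current_cpu;
}

/* Simulated per-CPU variable: one slot per CPU. */
#define DEMO_DEFINE_PER_CPU(type, name) \
	type demo_per_cpu__##name[DEMO_NR_CPUS]

#define demo_per_cpu(name, cpu)	(demo_per_cpu__##name[(cpu)])

/* Generic implementation: both accessors are built on per_cpu(). */
#define demo_get_cpu_var(name) \
	demo_per_cpu(name, demo_smp_processor_id())
#define demo_raw_get_cpu_var(name) \
	demo_per_cpu(name, demo_raw_smp_processor_id())

DEMO_DEFINE_PER_CPU(long, events);
```

On an architecture with an optimized implementation, the raw accessor would instead map straight onto the fast accessor, as the commit message says.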
     

29 Mar, 2006

1 commit


23 Mar, 2006

1 commit

  • When we stop allocating percpu memory for not-possible CPUs we must not touch
    the percpu data for not-possible CPUs at all. The correct way of doing this
    is to test cpu_possible() or to use for_each_cpu().

    This patch is a kernel-wide sweep of all instances of NR_CPUS. I found very
    few instances of this bug, if any. But the patch converts lots of open-coded
    tests to use the preferred helper macros.

    Cc: Mikael Starvik
    Cc: David Howells
    Acked-by: Kyle McMartin
    Cc: Anton Blanchard
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Cc: William Lee Irwin III
    Cc: Andi Kleen
    Cc: Christian Zankel
    Cc: Philippe Elie
    Cc: Nathan Scott
    Cc: Jens Axboe
    Cc: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
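
The rule stated above, never touching per-CPU data for a CPU that is not possible, can be sketched with an iteration helper that skips such CPUs. This is a user-space model with hypothetical demo_ names. The possible-CPU set is a plain bitmask and the "allocation" hands out slots from a static pool.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 8

/* Bit n set means CPU n is possible; here CPUs 0, 1 and 3. */
unsigned long demo_cpu_possible_mask = 0x0bUL;

#define demo_cpu_possible(cpu) \
	((demo_cpu_possible_mask >> (cpu)) & 1UL)

/* Visit only possible CPUs, instead of looping to DEMO_NR_CPUS blindly. */
#define demo_for_each_cpu(cpu)						\
	for ((cpu) = 0; (cpu) < DEMO_NR_CPUS; (cpu)++)			\
		if (!demo_cpu_possible(cpu))				\
			continue;					\
		else

long demo_pool[DEMO_NR_CPUS];		/* backing storage for the model */
long *demo_counter[DEMO_NR_CPUS];	/* stays NULL for not-possible CPUs */

/* "Allocate" per-CPU memory for possible CPUs only. */
void demo_alloc_counters(void)
{
	int cpu;

	demo_for_each_cpu(cpu)
		demo_counter[cpu] = &demo_pool[cpu];
}

/* Summing with the helper never dereferences an unallocated slot. */
long demo_sum_counters(void)
{
	long sum = 0;
	int cpu;

	demo_for_each_cpu(cpu)
		sum += *demo_counter[cpu];
	return sum;
}
```

An open-coded loop from 0 to DEMO_NR_CPUS would dereference the NULL entries here, which is the class of bug the sweep was looking for.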
     

24 Jun, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds