Eric Lee / smarc-fsl-linux-kernel

23 Dec, 2011

1 commit

933393f58 percpu: Remove irqsafe_cpu_xxx variants

We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

-tj: This is part of on-going percpu API cleanup. For detailed
discussion of the subject, please refer to the following thread.

http://thread.gmane.org/gmane.linux.kernel/1222078

Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo
LKML-Reference:

Christoph Lameter
2011-12-23 02:40:20 +0800

04 Jun, 2011

1 commit

d4d84fef6 slub: always align cpu_slab to honor cmpxchg_double requirement

On an architecture without CMPXCHG_LOCAL but with DEBUG_VM enabled,
the VM_BUG_ON() in __pcpu_double_call_return_bool() will cause an early
panic during boot unless we always align cpu_slab properly.

In principle we could remove the alignment-testing VM_BUG_ON() for
architectures that don't have CMPXCHG_LOCAL, but leaving it in means
that new code will tend not to break x86 even if it is introduced
on another platform, and it's low cost to require alignment.

Acked-by: David Rientjes
Acked-by: Christoph Lameter
Signed-off-by: Chris Metcalf
Signed-off-by: Pekka Enberg

Chris Metcalf
2011-06-04 00:33:49 +0800

05 May, 2011

1 commit

30106b8ce slub: Fix the lockless code on 32-bit platforms with no 64-bit cmpxchg

The SLUB allocator use of the cmpxchg_double logic was wrong: it
actually needs the irq-safe one.

That happens automatically when we use the native unlocked 'cmpxchg8b'
instruction, but when compiling the kernel for older x86 CPUs that do
not support that instruction, we fall back to the generic emulation
code.

And if you don't specify that you want the irq-safe version, the generic
code ends up just open-coding the cmpxchg8b equivalent without any
protection against interrupts or preemption. Which definitely doesn't
work for SLUB.

This was reported by Werner Landgraf, who saw
instability with his distro-kernel that was compiled to support pretty
much everything under the sun. Most big Linux distributions tend to
compile for PPro and later, and would never have noticed this problem.

This also fixes the prototypes for the irqsafe cmpxchg_double functions
to use 'bool' like they should.

[ Btw, that whole "generic code defaults to no protection" design just
sounds stupid - if the code needs no protection, there is no reason to
use "cmpxchg_double" to begin with. So we should probably just remove
the unprotected version entirely as pointless. - Linus ]

Signed-off-by: Thomas Gleixner
Reported-and-tested-by: werner
Acked-and-tested-by: Ingo Molnar
Acked-by: Christoph Lameter
Cc: Pekka Enberg
Cc: Jens Axboe
Cc: Tejun Heo
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionos
Signed-off-by: Ingo Molnar
Signed-off-by: Linus Torvalds

Thomas Gleixner
2011-05-05 05:20:20 +0800

28 Feb, 2011

1 commit

7c3343392 percpu: Generic support for this_cpu_cmpxchg_double()

Introduce this_cpu_cmpxchg_double(). this_cpu_cmpxchg_double() allows
the comparison between two consecutive words and replaces them if
there is a match.

bool this_cpu_cmpxchg_double(pcp1, pcp2,
old_word1, old_word2, new_word1, new_word2)

this_cpu_cmpxchg_double does not return the old value (difficult since
there are two words) but a boolean indicating if the operation was
successful.

The first percpu variable must be double word aligned!

-tj: Updated to return bool instead of int, converted size check to
BUILD_BUG_ON() instead of VM_BUG_ON() and other cosmetic changes.
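
The compare-and-replace semantics described above can be sketched in
plain user-space C. This is only an illustration under stated
assumptions: struct word_pair and cmpxchg_double_sketch() are
hypothetical names, and the function makes no attempt at the per cpu
atomicity that the kernel operation has to provide. It shows the bool
return value and the adjacency and alignment requirements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for two consecutive, double word aligned words. */
struct word_pair {
	_Alignas(2 * sizeof(uintptr_t)) uintptr_t w1;
	uintptr_t w2;
};

/*
 * Compare both words against the expected old values and store the new
 * values only if both match. Returns true on success, false otherwise.
 * NOT atomic: this is a sketch of the semantics, nothing more.
 */
bool cmpxchg_double_sketch(uintptr_t *p1, uintptr_t *p2,
			   uintptr_t old1, uintptr_t old2,
			   uintptr_t new1, uintptr_t new2)
{
	/* The words must be consecutive and the first double word aligned. */
	assert(p2 == p1 + 1);
	assert((uintptr_t)p1 % (2 * sizeof(uintptr_t)) == 0);

	if (*p1 != old1 || *p2 != old2)
		return false;

	*p1 = new1;
	*p2 = new2;
	return true;
}
```

A failed call leaves both words untouched, which is why a boolean is
enough for callers that simply retry.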

Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo

Christoph Lameter
2011-02-28 18:20:03 +0800

18 Dec, 2010

1 commit

2b7124428 percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support

Generic code to provide new per cpu atomic features

this_cpu_cmpxchg
this_cpu_xchg

The fallback uses functions that disable and re-enable interrupts
to ensure correct per cpu atomicity.

Fallback to regular cmpxchg and xchg is not possible since per cpu atomic
semantics include the guarantee that the current cpu's per cpu data is
accessed atomically. Use of regular cmpxchg and xchg requires the
determination of the address of the per cpu data before the regular cmpxchg
or xchg, and that address calculation cannot be atomically included in an
xchg or cmpxchg without a segment override.

tj: - Relocated new ops to conform better to the general organization.
- This patch contains a trivial comment fix.

Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo

Christoph Lameter
2010-12-18 22:54:04 +0800

17 Dec, 2010

2 commits

403047754 percpu,x86: relocate this_cpu_add_return() and friends

- include/linux/percpu.h: this_cpu_add_return() and friends were
located next to __this_cpu_add_return(). However, the overall
organization is to first group by preemption safeness. Relocate
this_cpu_add_return() and friends to preemption-safe area.

- arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after
other more basic operations. Relocate [__]this_cpu_add_return_8()
so that they're first grouped by preemption safeness.

Signed-off-by: Tejun Heo
Cc: Christoph Lameter

Tejun Heo
2010-12-17 23:13:22 +0800

a663ffff1 percpu: Generic support for this_cpu_add, sub, dec, inc_return

Introduce generic support for this_cpu_add_return etc.

The fallback is to realize these operations with simpler __this_cpu_ops.

tj: - Reformatted __cpu_size_call_return2() to make it more consistent
with its neighbors.
- Dropped unnecessary temp variable ret__ from
__this_cpu_generic_add_return().

Reviewed-by: Tejun Heo
Reviewed-by: Mathieu Desnoyers
Acked-by: H. Peter Anvin
Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo

Christoph Lameter
2010-12-17 22:15:28 +0800

23 Oct, 2010

1 commit

0fc0531e0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: update comments to reflect that percpu allocations are always zero-filled
percpu: Optimize __get_cpu_var()
x86, percpu: Optimize this_cpu_ptr
percpu: clear memory allocated with the km allocator
percpu: fix build breakage on s390 and cleanup build configuration tests
percpu: use percpu allocator on UP too
percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
vmalloc: pcpu_get/free_vm_areas() aren't needed on UP

Fixed up trivial conflicts in include/linux/percpu.h

Linus Torvalds
2010-10-23 08:31:36 +0800

21 Sep, 2010

1 commit

8b8e2ec1e percpu: Add {get,put}_cpu_ptr

These are similar to {get,put}_cpu_var() except for dynamically
allocated per-cpu memory.

Signed-off-by: Peter Zijlstra
Acked-by: Tejun Heo
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-09-21 19:55:43 +0800

08 Sep, 2010

2 commits

bbddff054 percpu: use percpu allocator on UP too

On UP, percpu allocations were redirected to kmalloc. This has the
following problems.

* For a certain amount of allocations (determined by
PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu
allocator can be used before the usual kernel memory allocator is
brought online. On SMP, this is used to initialize the kernel
memory allocator.

* percpu allocator honors alignment up to PAGE_SIZE but kmalloc()
doesn't. For example, workqueue makes use of larger alignments for
cpu_workqueues.

Currently, users of percpu allocators need to handle UP differently,
which is somewhat fragile and ugly. Other than small amount of
memory, there isn't much to lose by enabling percpu allocator on UP.
It can simply use kernel memory based chunk allocation which was added
for SMP archs w/o MMUs.

This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
makes UP build use percpu-km. As percpu addresses and kernel
addresses are always identity mapped and static percpu variables don't
need any special treatment, nothing is arch dependent and mm/percpu.c
implements generic setup_per_cpu_areas() for UP.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter
Acked-by: Pekka Enberg

Tejun Heo
2010-09-08 17:11:23 +0800

6abad5aca percpu: reduce PCPU_MIN_UNIT_SIZE to 32k

In preparation of enabling percpu allocator for UP, reduce
PCPU_MIN_UNIT_SIZE to 32k. On UP, the first chunk doesn't have to
include static percpu variables and chunk size can be smaller which is
important as UP percpu allocator will use contiguous kernel memory to
populate chunks.

PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation
size but 32k should still be enough.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter

Tejun Heo
2010-09-08 17:11:12 +0800

07 Aug, 2010

1 commit

18cb2aef9 percpu: handle __percpu notations in UP accessors

UP accessors didn't take care of __percpu notations leading to a lot
of spurious sparse warnings on UP configurations. Fix it.

Signed-off-by: Namhyung Kim
Signed-off-by: Tejun Heo

Namhyung Kim
2010-08-07 20:20:53 +0800

28 Jun, 2010

2 commits

099a19d91 percpu: allow limited allocation before slab is online

This patch updates percpu allocator such that it can serve limited
amount of allocation before slab comes online. This is primarily to
allow slab to depend on working percpu allocator.

Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
much memory space and allocation map slots are reserved. If this
reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation
will fail till slab comes online.

The following changes are made to implement early alloc.

* pcpu_mem_alloc() now checks slab_is_available()

* Chunks are allocated using pcpu_mem_alloc()

* Init paths make sure ai->dyn_size is at least as large as
PERCPU_DYNAMIC_EARLY_SIZE.

* Initial alloc maps are allocated in __initdata and copied to
kmalloc'd areas once slab is online.

Signed-off-by: Tejun Heo
Cc: Christoph Lameter

Tejun Heo
2010-06-28 00:50:00 +0800

4ba6ce250 percpu: make @dyn_size always mean min dyn_size in first chunk init functions

In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was
ssize_t, -1 meant auto-size, 0 forced 0 and positive meant minimum
size. There's no use case for forcing 0 and the upcoming early alloc
support always requires non-zero dynamic size. Make @dyn_size always
mean minimum dyn_size.

While at it, make pcpu_build_alloc_info() static which doesn't have
any external caller as suggested by David Rientjes.

Signed-off-by: Tejun Heo
Cc: David Rientjes

Tejun Heo
2010-06-28 00:49:59 +0800

06 Apr, 2010

1 commit

b66696e3c Merge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc

* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
eeepc-wmi: include slab.h
staging/otus: include slab.h from usbdrv.h
percpu: don't implicitly include slab.h from percpu.h
kmemcheck: Fix build errors due to missing slab.h
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
iwlwifi: don't include iwl-dev.h from iwl-devtrace.h
x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h

Fix up trivial conflicts in include/linux/percpu.h due to
is_kernel_percpu_address() having been introduced since the slab.h
cleanup with the percpu_up.c splitup.

Linus Torvalds
2010-04-06 00:39:11 +0800

30 Mar, 2010

1 commit

de380b55f percpu: don't implicitly include slab.h from percpu.h

percpu.h has always been including slab.h to get k[mz]alloc/free() for
UP inline implementation. percpu.h being used by very low level
headers including module.h and sched.h, this meant that a lot of files
unintentionally got slab.h inclusion.

Lee Schermerhorn was trying to make topology.h use percpu.h and got
bitten by this implicit inclusion. The right thing to do is break
this ultimately unnecessary dependency. The previous patch added
explicit inclusion of either gfp.h or slab.h to the source files using
them. This patch updates percpu.h such that slab.h is no longer
included from percpu.h.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter
Cc: Ingo Molnar
Cc: Lee Schermerhorn

Tejun Heo
2010-03-30 21:02:32 +0800

29 Mar, 2010

1 commit

10fad5e46 percpu, module: implement and use is_kernel/module_percpu_address()

lockdep has custom code to check whether a pointer belongs to static
percpu area which is somewhat broken. Implement proper
is_kernel/module_percpu_address() and replace the custom code.

On UP, percpu variables are regular static variables and can't be
distinguished from them. Always return %false on UP.

Signed-off-by: Tejun Heo
Acked-by: Peter Zijlstra
Cc: Rusty Russell
Cc: Ingo Molnar

Tejun Heo
2010-03-29 22:07:12 +0800

05 Jan, 2010

1 commit

32032df6c Merge branch 'master' into percpu

Conflicts:
arch/powerpc/platforms/pseries/hvCall.S
include/linux/percpu.h

Tejun Heo
2010-01-05 08:17:33 +0800

08 Dec, 2009

1 commit

50de1a8ef Merge branch 'for-linus' into for-next

Conflicts:
mm/percpu.c

Tejun Heo
2009-12-08 09:02:12 +0800

02 Dec, 2009

1 commit

ee0a6efc1 percpu: add missing per_cpu_ptr_to_phys() definition for UP

Commit 3b034b0d084221596bf35c8d893e1d4d5477b9cc implemented
per_cpu_ptr_to_phys() but forgot to add UP definition. Add UP
definition which is simple wrapper around __pa().

Signed-off-by: Tejun Heo
Cc: Vivek Goyal
Reported-by: Randy Dunlap

Tejun Heo
2009-12-02 07:36:58 +0800

25 Nov, 2009

1 commit

3b034b0d0 percpu: Fix kdump failure if booted with percpu_alloc=page

o kdump functionality reserves a per cpu area at boot time and exports the
physical address of that area to user space through sys interface. This
area stores some dump related information like cpu register states etc
at the time of crash.

o We were assuming that the per cpu area always comes from a linearly mapped
memory region and using __pa() to determine the physical address.
With percpu_alloc=page, the per cpu area can come from the vmalloc region
as well, and __pa() breaks.

o This patch implements a new function to convert per cpu address to
physical address.

Before the patch, crash_notes addresses looked as follows.

cpu0 60fffff49800
cpu1 60fffff60800
cpu2 60fffff77800

These are bogus physical addresses.

After the patch, the addresses look as follows.

cpu0 13eb44000
cpu1 13eb43000
cpu2 13eb42000
cpu3 13eb41000

These look fine. I have 4G of memory and /proc/iomem tells me the following.

100000000-13fffffff : System RAM

tj: * added missing asm/io.h include reported by Stephen Rothwell
* repositioned per_cpu_ptr_to_phys() in percpu.c and added comment.

Signed-off-by: Vivek Goyal
Signed-off-by: Tejun Heo
Cc: Stephen Rothwell

Vivek Goyal
2009-11-25 20:49:22 +0800

29 Oct, 2009

6 commits

545695fb4 percpu: make accessors check for percpu pointer in sparse

The previous patch made sparse warn about percpu variables being used
directly without going through percpu accessors. This patch
implements the other half - checking whether non percpu variable is
passed into percpu accessors.

Signed-off-by: Tejun Heo
Cc: Rusty Russell
Cc: Al Viro

Tejun Heo
2009-10-29 21:34:15 +0800

e0fdb0e05 percpu: add __percpu for sparse.

We have to make __kernel "__attribute__((address_space(0)))" so we can
cast to it.

tj: * put_cpu_var() update.

* Annotations added to dynamic allocator interface.

Signed-off-by: Rusty Russell
Cc: Al Viro
Signed-off-by: Tejun Heo

Rusty Russell
2009-10-29 21:34:15 +0800

f7b64fe80 percpu: make access macros universal

Now that per_cpu__ prefix is gone, there's no distinction between
static and dynamic percpu variables. Make get_cpu_var() take dynamic
percpu variables and ensure that all macros have parentheses around
the parameter evaluation and evaluate the variable parameter only once
such that any expression which evaluates to percpu address can be used
safely.
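
Why single evaluation matters can be shown with a small user-space
sketch. The macro and function names below are hypothetical and are not
the kernel's accessors; the safer variant relies on the GNU C statement
expression and __typeof__ extensions. An argument with a side effect is
evaluated twice by the naive macro and once by the safer one.

```c
#include <assert.h>

static int calls;
static int values[2] = { 5, 7 };

/* Returns a pointer and counts how often it has been evaluated. */
int *next_ptr(void)
{
	return &values[calls++ % 2];
}

/* Naive: the argument expression appears, and is evaluated, twice. */
#define SUM_TWICE_NAIVE(p)	(*(p) + *(p))

/* Safer: evaluate the argument once into a uniquely named local. */
#define SUM_TWICE_ONCE(p)	({ __typeof__(p) p__ = (p); *p__ + *p__; })

/* Number of times the argument is evaluated by the naive macro. */
int eval_count_naive(void)
{
	calls = 0;
	(void)SUM_TWICE_NAIVE(next_ptr());
	return calls;
}

/* Number of times the argument is evaluated by the safer macro. */
int eval_count_once(void)
{
	calls = 0;
	(void)SUM_TWICE_ONCE(next_ptr());
	return calls;
}
```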

Signed-off-by: Tejun Heo

Tejun Heo
2009-10-29 21:34:15 +0800

dd17c8f72 percpu: remove per_cpu__ prefix.

Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables. To make this sane, we remove the per_cpu__ prefix we
created to stop people accidentally using these vars directly.

Now that we have sparse, we can use that (next patch).

tj: * Updated to convert stuff which were missed by or added after the
original patch.

* Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell
Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter

Rusty Russell
2009-10-29 21:34:15 +0800

0f5e4816d percpu: remove some sparse warnings

Make the following changes to remove some sparse warnings.

* Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before
defining it.

* Annotate pcpu_extend_area_map() that it is entered with pcpu_lock
held, releases it and then reacquires it.

* Make percpu related macros use unique nested variable names.

* While at it, add pcpu prefix to __size_call[_return]() macros as
to-be-implemented sparse annotations will add percpu specific stuff
to these macros.

Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter
Cc: Rusty Russell

Tejun Heo
2009-10-29 21:34:12 +0800

64ef291f4 percpu: make alloc_percpu() handle array types

alloc_percpu() couldn't handle array types like "int [100]" due to the
way the return type was cast. Fix it by using typeof() instead.
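
A user-space sketch of the casting problem, assuming a hypothetical
alloc_sketch() macro built on calloc() rather than the real
alloc_percpu(): writing the cast as (type *) would expand to
"(int [100] *)" for an array type, which does not parse, whereas
wrapping the type in __typeof__ (a GNU C extension) produces a proper
pointer-to-array type.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical allocation macro that takes a type. The __typeof__
 * wrapper makes the cast valid for array types such as int [100].
 */
#define alloc_sketch(type) ((__typeof__(type) *)calloc(1, sizeof(type)))

/* Allocate one zeroed int [100]; return the size of the object, 0 on failure. */
size_t array_alloc_demo(void)
{
	int (*arr)[100] = alloc_sketch(int [100]);
	size_t n = 0;

	if (arr) {
		/* The pointee really is the whole array, not a single int. */
		assert(sizeof(*arr) == 100 * sizeof(int));
		(*arr)[99] = 7;	/* index through the pointer-to-array */
		n = sizeof(*arr);
		free(arr);
	}
	return n;
}
```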

Signed-off-by: Tejun Heo
Reviewed-by: Frederic Weisbecker
Reviewed-by: Christoph Lameter

Tejun Heo
2009-10-29 21:34:12 +0800

03 Oct, 2009

1 commit

7340a0b15 this_cpu: Introduce this_cpu_ptr() and generic this_cpu_* operations

This patch introduces two things: First this_cpu_ptr and then per cpu
atomic operations.

this_cpu_ptr
------------

A common operation when dealing with cpu data is to get the instance of the
cpu data associated with the currently executing processor. This can be
optimized by

this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id).

The problem with per_cpu_ptr(x, smp_processor_id) is that it requires
an array lookup to find the offset for the cpu. Processors typically
have the offset for the current cpu area in some kind of (arch dependent)
efficiently accessible register or memory location.

We can use that instead of doing the array lookup to speed up the
determination of the address of the percpu variable. This is particularly
significant because these lookups occur in performance critical paths
of the core kernel. this_cpu_ptr() can avoid memory accesses and

this_cpu_ptr comes in two flavors. The preemption context matters since we
are referring to the currently executing processor. In many cases we must
ensure that the processor does not change while a code segment is executed.

__this_cpu_ptr -> Do not check for preemption context
this_cpu_ptr -> Check preemption context

The parameter to these operations is a per cpu pointer. This can be the
address of a statically defined per cpu variable (&per_cpu_var(xxx)) or
the address of a per cpu variable allocated with the per cpu allocator.
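
The difference between the two lookups can be sketched in user-space C.
Everything here is a simulation with hypothetical names: a flat byte
array stands in for the per cpu areas, cpu_offset[] is the offset table
that per_cpu_ptr() has to index, and this_cpu_off stands in for the arch
register or memory location holding the current cpu's offset.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH		4
#define UNIT_SIZE_SKETCH	64

/* Backing store: one unit per simulated cpu, laid out back to back. */
static unsigned char areas[NR_CPUS_SKETCH * UNIT_SIZE_SKETCH];

/* Offset table indexed by cpu number (the array lookup). */
static size_t cpu_offset[NR_CPUS_SKETCH];

/* Stand-in for the cheaply accessible offset of the current cpu. */
static size_t this_cpu_off;

/* Fill the offset table and pretend that we are running on @cpu. */
void setup_sketch(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	for (int i = 0; i < NR_CPUS_SKETCH; i++)
		cpu_offset[i] = (size_t)i * UNIT_SIZE_SKETCH;
	this_cpu_off = cpu_offset[cpu];
}

/* A "per cpu pointer" is an address inside unit 0. */
void *percpu_base_sketch(size_t off)
{
	assert(off < UNIT_SIZE_SKETCH);
	return &areas[off];
}

/* Generic form: needs the cpu number and a table lookup. */
void *per_cpu_ptr_sketch(void *pcpu, int cpu)
{
	return (unsigned char *)pcpu + cpu_offset[cpu];
}

/* Optimized form: adds the already known offset, no table access. */
void *this_cpu_ptr_sketch(void *pcpu)
{
	return (unsigned char *)pcpu + this_cpu_off;
}
```

Both functions yield the same address for the current cpu; the second
one just skips the table.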

per cpu atomic operations: this_cpu_*(var, val)
-----------------------------------------------
this_cpu_* operations (like this_cpu_add(struct->y, value)) operate on
arbitrary scalars that are members of structures allocated with the new
per cpu allocator. They can also operate on static per_cpu variables
if they are passed to per_cpu_var() (See patch to use this_cpu_*
operations for vm statistics).

These operations are guaranteed to be atomic vs preemption when modifying
the scalar. The calculation of the per cpu offset is also guaranteed to
be atomic at the same time. This means that a this_cpu_* operation can be
safely used to modify a per cpu variable in a context where interrupts are
enabled and preemption is allowed. Many architectures can perform such
a per cpu atomic operation with a single instruction.

Note that the atomicity here is different from regular atomic operations.
Atomicity is only guaranteed for data accessed from the currently executing
processor. Modifications from other processors are still possible. There
must be other guarantees that the per cpu data is not modified from another
processor when using these instructions. The per cpu atomicity is created
by the fact that the processor either executes an instruction or not.
Embedded in the instruction is the relocation of the per cpu address to
the area reserved for the current processor and the RMW action. Therefore
interrupts or preemption cannot occur in the midst of this processing.

Generic fallback functions are used if an arch does not define optimized
this_cpu operations. The functions also come in the two flavors used
for this_cpu_ptr().

The first parameter is a scalar that is a member of a structure allocated
through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The
operations are similar to what percpu_add() and friends do.

this_cpu_read(scalar)
this_cpu_write(scalar, value)
this_cpu_add(scalar, value)
this_cpu_sub(scalar, value)
this_cpu_inc(scalar)
this_cpu_dec(scalar)
this_cpu_and(scalar, value)
this_cpu_or(scalar, value)
this_cpu_xor(scalar, value)

Arch code can override the generic functions and provide optimized atomic
per cpu operations. These atomic operations must provide both the relocation
(x86 does it through a segment override) and the operation on the data in a
single instruction. Otherwise preempt needs to be disabled and there is no
gain from providing arch implementations.

A third variant is provided prefixed by irqsafe_. These variants are safe
against hardware interrupts on the *same* processor (all per cpu atomic
primitives are *always* *only* providing safety for code running on the
*same* processor!). The increment needs to be implemented by the hardware
in such a way that it is a single RMW instruction that is either processed
before or after an interrupt.

cc: David Howells
cc: Ingo Molnar
cc: Rusty Russell
cc: Eric Dumazet
Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo

Christoph Lameter
2009-10-03 18:48:22 +0800

02 Oct, 2009

1 commit

23fb064bb percpu: kill legacy percpu allocator

With ia64 converted, there's no arch left which still uses legacy
percpu allocator. Kill it.

Signed-off-by: Tejun Heo
Delightedly-acked-by: Rusty Russell
Cc: Ingo Molnar
Cc: Christoph Lameter

Tejun Heo
2009-10-02 12:29:29 +0800

14 Aug, 2009

11 commits

e933a73f4 percpu: kill lpage first chunk allocator

With x86 converted to embedding allocator, lpage doesn't have any user
left. Kill it along with cpa handling code.

Signed-off-by: Tejun Heo
Cc: Jan Beulich

Tejun Heo
2009-08-14 14:00:53 +0800

c8826dd53 percpu: update embedding first chunk allocator to handle sparse units

Now that percpu core can handle very sparse units, given that vmalloc
space is large enough, embedding first chunk allocator can use any
memory to build the first chunk. This patch teaches
pcpu_embed_first_chunk() about distances between cpus and to use
alloc/free callbacks to allocate node specific areas for each group
and use them for the first chunk.

This brings the benefits of embedding allocator to NUMA configurations
- no extra TLB pressure with the flexibility of unified dynamic
allocator and no need to restructure arch code to build memory layout
suitable for percpu. With units put into atom_size aligned groups
according to cpu distances, using large page for dynamic chunks is
also easily possible with falling back to regular pages if large
allocation fails.

Embedding allocator users are converted to specify NULL
cpu_distance_fn, so this patch doesn't cause any visible behavior
difference. Following patches will convert them.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:52 +0800

fb435d523 percpu: add pcpu_unit_offsets[]

Currently units are mapped sequentially into address space. This
patch adds pcpu_unit_offsets[] which allows units to be mapped to
arbitrary offsets from the chunk base address. This is necessary to
allow sparse embedding which might need to allocate address
ranges and memory areas which aren't aligned to unit size but
allocation atom size (page or large page size). This also simplifies
things a bit by removing the need to calculate offset from unit
number.

With this change, there's no need for the arch code to know
pcpu_unit_size. Update pcpu_setup_first_chunk() and first chunk
allocators to return regular 0 or -errno return code instead of unit
size or -errno.
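
The change in addressing can be sketched as two small user-space
helpers. The function names are hypothetical and only the arithmetic is
shown: with sequential mapping the offset of a unit is computed from the
unit number and unit size, whereas with an offsets table it is simply
looked up, so units no longer need to sit at multiples of the unit size.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential mapping: units are back to back, so the offset is computed. */
size_t unit_offset_sequential(int unit, size_t unit_size)
{
	assert(unit >= 0);
	return (size_t)unit * unit_size;
}

/* Table mapping: each unit's offset from the chunk base is looked up. */
size_t unit_offset_table(const size_t *unit_offsets, int unit)
{
	assert(unit >= 0);
	return unit_offsets[unit];
}
```

With the table, a layout such as { 0, 0x8000, 0x200000 } is
expressible even though the third unit is not at 2 * unit_size.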

Signed-off-by: Tejun Heo
Cc: David S. Miller

Tejun Heo
2009-08-14 14:00:51 +0800

fd1e8a1fe percpu: introduce pcpu_alloc_info and pcpu_group_info

Till now, non-linear cpu->unit map was expressed using an integer
array which maps each cpu to a unit and used only by lpage allocator.
Although how many units have been placed in a single contiguous area
(group) is known while building unit_map, the information is lost when
the result is recorded into the unit_map array. For lpage allocator,
as all allocations are done by lpages and whether two adjacent lpages
are in the same group or not is irrelevant, this didn't cause any
problem. Non-linear cpu->unit mapping will be used for sparse
embedding and this grouping information is necessary for that.

This patch introduces pcpu_alloc_info which contains all the
information necessary for initializing percpu allocator.
pcpu_alloc_info contains array of pcpu_group_info which describes how
units are grouped and mapped to cpus. pcpu_group_info also has
base_offset field to specify its offset from the chunk's base address.
pcpu_build_alloc_info() initializes this field as if all groups are
allocated back-to-back as is currently done but this will be used to
sparsely place groups.

pcpu_alloc_info is a rather complex data structure which contains a
flexible array which in turn points to nested cpu_map arrays.

* pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to
help dealing with pcpu_alloc_info.

* pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info,
generalized and renamed to pcpu_build_alloc_info().
@cpu_distance_fn may be NULL indicating that all cpus are of
LOCAL_DISTANCE.

* pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info,
generalized and renamed to pcpu_dump_alloc_info(). It now also
prints which group each alloc unit belongs to.

* pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the
separate parameters. All first chunk allocators are updated to use
pcpu_build_alloc_info() to build alloc_info and call
pcpu_setup_first_chunk() with it. This has the side effect of
packing units for sparse possible cpus. ie. if cpus 0, 2 and 4 are
possible, they'll be assigned unit 0, 1 and 2 instead of 0, 2 and 4.

* x86 setup_pcpu_lpage() is updated to deal with alloc_info.

* sparc64 setup_per_cpu_areas() is updated to build alloc_info.

Although the changes made by this patch are pretty pervasive, it
doesn't cause any behavior difference other than packing of sparse
cpus. It mostly changes how information is passed among
initialization functions and makes room for more flexibility.
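
The packing side effect mentioned above (cpus 0, 2 and 4 getting units
0, 1 and 2) can be sketched as follows. This is a simplified user-space
illustration with hypothetical names; it ignores grouping by cpu
distance, which the real pcpu_build_alloc_info() also has to handle.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_UNIT (-1)

/*
 * Assign consecutive unit numbers to the possible cpus only.
 * possible[cpu] says whether the cpu can ever exist; unit_map[cpu]
 * receives its unit, or NO_UNIT if the cpu is not possible.
 * Returns the number of units handed out.
 */
int pack_units_sketch(const bool *possible, int nr_cpus, int *unit_map)
{
	int next_unit = 0;

	assert(nr_cpus >= 0);
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		unit_map[cpu] = possible[cpu] ? next_unit++ : NO_UNIT;

	return next_unit;
}
```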

Signed-off-by: Tejun Heo
Cc: Ingo Molnar
Cc: David Miller

Tejun Heo
2009-08-14 14:00:51 +0800

033e48fb8 percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward

Unit map handling will be generalized and extended and used for
embedding sparse first chunk and other purposes. Relocate two
unit_map related functions upward in preparation. This patch just
moves the code without any actual change.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:51 +0800

3cbc85652 percpu: add @align to pcpu_fc_alloc_fn_t

pcpu_fc_alloc_fn_t is about to see more interesting usage, add @align
parameter.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:50 +0800

1d9d32572 percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()

Now that all actual first chunk allocation and copying happen in the
first chunk allocators and helpers, there's no reason for
pcpu_setup_first_chunk() to try to determine @dyn_size automatically.
The only left user is page first chunk allocator. Make it determine
dyn_size like other allocators and make @dyn_size mandatory for
pcpu_setup_first_chunk().

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:50 +0800

9a7737691 percpu: drop @static_size from first chunk allocators

First chunk allocators assume percpu areas have been linked using one
of PERCPU_*() macros and depend on __per_cpu_load symbol defined by
those macros, so there isn't much point in passing in static area size
explicitly when it can be easily calculated from __per_cpu_start and
__per_cpu_end. Drop @static_size from all percpu first chunk
allocators and helpers.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:50 +0800

f58dc01ba percpu: generalize first chunk allocator selection

Now that all first chunk allocators are in mm/percpu.c, it makes sense
to generalize the percpu_alloc kernel parameter. Define PCPU_FC_*
and set pcpu_chosen_fc using early_param() in mm/percpu.c. Arch code
can use the set value to determine which first chunk allocator to use.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:50 +0800

08fc45806 percpu: build first chunk allocators selectively

There's no need to build unused first chunk allocators in. Define
CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them
selectively.

Signed-off-by: Tejun Heo

Tejun Heo
2009-08-14 14:00:49 +0800

00ae4064b percpu: rename 4k first chunk allocator to page

Page size isn't always 4k depending on arch and configuration. Rename
4k first chunk allocator to page.

Signed-off-by: Tejun Heo
Cc: David Howells

Tejun Heo
2009-08-14 14:00:49 +0800