Eric Lee / smarc-fsl-linux-kernel

11 Jan, 2012

3 commits

799f933a8 mm: bootmem: try harder to free pages in bulk ... Browse Code »

The loop that frees pages to the page allocator while bootstrapping tries
to free higher-order blocks only when the starting address is aligned to
that block size. Otherwise it will free all pages on that node
one-by-one.

Change it to free individual pages up to the first aligned block and then
try higher-order frees from there.

Signed-off-by: Johannes Weiner
Cc: Uwe Kleine-König
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2012-01-11 08:30:45 +0800
560a036b3 mm: bootmem: drop superfluous range check when freeing pages in bulk ... Browse Code »

The area node_bootmem_map represents is aligned to BITS_PER_LONG, and all
bits in any aligned word of that map valid. When the represented area
extends beyond the end of the node, the non-existant pages will be marked
as reserved.

As a result, when freeing a page block, doing an explicit range check for
whether that block is within the node's range is redundant as the bitmap
is consulted anyway to see whether all pages in the block are unreserved.

Signed-off-by: Johannes Weiner
Cc: Uwe Kleine-König
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2012-01-11 08:30:44 +0800
9571a9829 bootmem: micro optimize freeing pages in bulk ... Browse Code »

The first entry of bdata->node_bootmem_map holds the data for
bdata->node_min_pfn up to bdata->node_min_pfn + BITS_PER_LONG - 1. So the
test for freeing all pages of a single map entry can be slightly relaxed.

Moreover use DIV_ROUND_UP in another place instead of open coding it.

Signed-off-by: Uwe Kleine-König
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Uwe Kleine-König
2012-01-11 08:30:44 +0800

31 Oct, 2011

1 commit

b95f1b31b mm: Map most files to use export.h instead of module.h ... Browse Code »

The files changed within are only using the EXPORT_SYMBOL
macro variants. They are not using core modular infrastructure
and hence don't need module.h but only the export.h header.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-10-31 21:20:12 +0800

24 Mar, 2011

1 commit

93a72052b crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setu… ... Browse Code »
43

…p_elfcorehdr and saved_max_pfn

The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state. To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.

Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.

Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup()
to early_param(). It adds an address range check from x86 also on ia64
and powerpc.

[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Olaf Hering
2011-03-24 10:47:19 +0800

24 Feb, 2011

2 commits

e782ab421 bootmem: Move contig_page_data definition to bootmem.c/nobootmem.c ... Browse Code »

Now that bootmem.c and nobootmem.c are separate, it's cleaner to
define contig_page_data in each file than in page_alloc.c with #ifdef.
Move it.

This patch doesn't introduce any behavior change.

-v2: According to Andrew, fixed the struct layout.
-tj: Updated commit description.

Signed-off-by: Yinghai Lu
Acked-by: Andrew Morton
Signed-off-by: Tejun Heo

Yinghai Lu
2011-02-24 21:43:06 +0800
093258732 bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c ... Browse Code »

mm/bootmem.c contained code paths for both bootmem and no bootmem
configurations. They implement about the same set of APIs in
different ways and as a result bootmem.c contains massive amount of
#ifdef CONFIG_NO_BOOTMEM.

Separate out CONFIG_NO_BOOTMEM code into mm/nobootmem.c. As the
common part is relatively small, duplicate them in nobootmem.c instead
of creating a common file or ifdef'ing in bootmem.c.

The followings are duplicated.

* {min|max}_low_pfn, max_pfn, saved_max_pfn
* free_bootmem_late()
* ___alloc_bootmem()
* __alloc_bootmem_low()

The followings are applicable only to nobootmem and moved verbatim.

* __free_pages_memory()
* free_all_memory_core_early()

The followings are not applicable to nobootmem and omitted in
nobootmem.c.

* reserve_bootmem_node()
* reserve_bootmem()

The rest split function bodies according to CONFIG_NO_BOOTMEM.

Makefile is updated so that only either bootmem.c or nobootmem.c is
built according to CONFIG_NO_BOOTMEM.

This patch doesn't introduce any behavior change.

-tj: Rewrote commit description.

Suggested-by: Ingo Molnar
Signed-off-by: Yinghai Lu
Acked-by: Andrew Morton
Signed-off-by: Tejun Heo

Yinghai Lu
2011-02-24 21:43:05 +0800

28 Aug, 2010

3 commits

a9ce6bc15 x86, memblock: Replace e820_/_early string with memblock_ ... Browse Code »

1.include linux/memblock.h directly. so later could reduce e820.h reference.
2 this patch is done by sed scripts mainly

-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL

Signed-off-by: Yinghai Lu
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-08-28 02:13:47 +0800
72d7c3b33 x86: Use memblock to replace early_res ... Browse Code »

1. replace find_e820_area with memblock_find_in_range
2. replace reserve_early with memblock_x86_reserve_range
3. replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. use _e820, _early wrap in the patch, in following patch, will
replace them all
6. because memblock_x86_free_range support partial free, we can remove some special care
7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill()
so adjust some calling later in setup.c::setup_arch()
-- corruption_check and mptable_update

-v2: Move reserve_brk() early
Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range()
that could happen We have more then 128 RAM entry in E820 tables, and
memblock_x86_fill() could use memblock_find_in_range() to find a new place for
memblock.memory.region array.
and We don't need to use extend_brk() after fill_memblock_area()
So move reserve_brk() early before fill_memblock_area().
-v3: Move find_smp_config early
To make sure memblock_find_in_range not find wrong place, if BIOS doesn't put mptable
in right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory. and they are already in
memblock.reserved already..
use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later.
-v5: Generic version __memblock_find_in_range() is going from high to low, and for 32bit
active_region for 32bit does include high pages
need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
-v6: Use current_limit instead
-v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L
-v8: Set memblock_can_resize early to handle EFI with more RAM entries
-v9: update after kmemleak changes in mainline

Suggested-by: David S. Miller
Suggested-by: Benjamin Herrenschmidt
Suggested-by: Thomas Gleixner
Signed-off-by: Yinghai Lu
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-08-28 02:12:29 +0800
f88eff74a bootmem, x86: Add weak version of reserve_bootmem_generic ... Browse Code »

It will be used memblock_x86_to_bootmem converting

It is an wrapper for reserve_bootmem, and x86 64bit is using special one.

Also clean up that version for x86_64. We don't need to take care of numa
path for that, bootmem can handle it how

Signed-off-by: Yinghai Lu
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-08-28 02:08:13 +0800

21 Jul, 2010

1 commit

b8ab9f820 x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used ... Browse Code »

Borislav Petkov reported his 32bit numa system has problem:

[ 0.000000] Reserving total of 4c00 pages for numa KVA remap
[ 0.000000] kva_start_pfn ~ 32800 max_low_pfn ~ 375fe
[ 0.000000] max_pfn = 238000
[ 0.000000] 8202MB HIGHMEM available.
[ 0.000000] 885MB LOWMEM available.
[ 0.000000] mapped low ram: 0 - 375fe000
[ 0.000000] low ram: 0 - 375fe000
[ 0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 1000 1000 => 34e7000
[ 0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 200 40 => 34c9d80
[ 0.000000] alloc (nid=0 100000 - 7ee00000) (1000000 - ffffffffffffffff) 180 40 => 34e6140
[ 0.000000] alloc (nid=1 80000000 - c7e60000) (1000000 - ffffffffffffffff) 240 40 => 80000000
[ 0.000000] BUG: unable to handle kernel paging request at 40000000
[ 0.000000] IP: [] __alloc_memory_core_early+0x147/0x1d6
[ 0.000000] *pdpt = 0000000000000000 *pde = f000ff53f000ff00
...
[ 0.000000] Call Trace:
[ 0.000000] [] ? __alloc_bootmem_node+0x216/0x22f
[ 0.000000] [] ? sparse_early_usemaps_alloc_node+0x5a/0x10b
[ 0.000000] [] ? sparse_init+0x1dc/0x499
[ 0.000000] [] ? paging_init+0x168/0x1df
[ 0.000000] [] ? native_pagetable_setup_start+0xef/0x1bb

looks like it allocates too much high address for bootmem.

Try to cut limit with get_max_mapped()

Reported-by: Borislav Petkov
Tested-by: Conny Seidel
Signed-off-by: Yinghai Lu
Cc: [2.6.34.x]
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Johannes Weiner
Cc: Lee Schermerhorn
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yinghai Lu
2010-07-21 07:25:40 +0800

08 Apr, 2010

1 commit

fb1ae6357 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/… ... Browse Code »

…git/x86/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
x86: Increase CONFIG_NODES_SHIFT max to 10
ibft, x86: Change reserve_ibft_region() to find_ibft_region()
x86, hpet: Fix bug in RTC emulation
x86, hpet: Erratum workaround for read after write of HPET comparator
bootmem, x86: Fix 32bit numa system without RAM on node 0
nobootmem, x86: Fix 32bit numa system without RAM on node 0
x86: Handle overlapping mptables
x86: Make e820_remove_range to handle all covered case
x86-32, resume: do a global tlb flush in S4 resume

Linus Torvalds
2010-04-08 02:02:23 +0800

02 Apr, 2010

2 commits

aa235fc71 bootmem, x86: Fix 32bit numa system without RAM on node 0 ... Browse Code »

When 32bit numa is used, free_all_bootmem() will still only go over with
node id 0.

If node 0 doesn't have RAM installed, the lowest populated node
becomes low RAM.

This one fixes BOOTMEM path by iterating over the bdata_list.

-v3: add more comments, and fix bootmem path too.
-v4: seperate from one big patch

Signed-off-by: Yinghai Lu
LKML-Reference:
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-04-02 05:41:19 +0800
337998587 nobootmem, x86: Fix 32bit numa system without RAM on node 0 ... Browse Code »

On one system without RAM on node0, got following boot dump with a 32
bit NUMA kernel:

early_node_map[4] active PFN ranges
1: 0x00000010 -> 0x00000099
1: 0x00000100 -> 0x0007da00
1: 0x0007e800 -> 0x0007ffa0
1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
#000 [0000001000 - 0000002000]
#001 [0000089000 - 000008f000]
#002 [0000091000 - 0000093500]
...
#027 [007cbfef40 - 007e800000]
#028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
[] ? printk+0xf/0x11
[] __alloc_pages_nodemask+0x417/0x487
[] new_slab+0xe2/0x1fe
[] kmem_cache_open+0x185/0x358
[] T.954+0x1c/0x60
[] kmem_cache_init+0x24/0x113
[] start_kernel+0x166/0x2e4
[] ? unknown_bootoption+0x0/0x18e
[] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Node 1 Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:0 slab_reclaimable:0 slab_unreclaimable:0
mapped:0 shmem:0 pagetables:0 bounce:0

When 32bit NUMA is used, free_all_bootmem() will still only go over with
node id 0.

If node 0 doesn't have RAM installed, We need to go with node1
because early_node_map still use 1 for all ranges, and ram from node1
become low ram.

Use MAX_NUMNODES like 64-bit NUMA does.

Note: BOOTMEM path has the same problem.
this bug exist before We have NO_BOOTMEM support.

-v3: add more comments, and fix bootmem path too.
-v4: seperate bootmem path fix

Signed-off-by: Yinghai Lu
LKML-Reference:
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-04-02 05:39:29 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implic… ... Browse Code »

…it slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

25 Mar, 2010

1 commit

c26f91a3d x86: Remove excessive early_res debug output ... Browse Code »

Commit 08677214e318297 ("x86: Make 64 bit use early_res instead
of bootmem before slab") introduced early_res replacement for
bootmem, but left code in __free_pages_memory() which dumps all
the ranges that are beeing freed, without any additional
information, causing some noise in dmesg during bootup.

Just remove printing of the ranges, that doesn't provide
anything useful anyway.

While at it, remove other commented-out KERN_DEBUG messages in
the NO_BOOTMEM code as well.

Signed-off-by: Jiri Kosina
Found-OK-by: Andrew Morton
Cc: Johannes Weiner
Cc: Yinghai Lu
LKML-Reference:
Signed-off-by: Ingo Molnar

Jiri Kosina
2010-03-25 02:51:08 +0800

13 Feb, 2010

1 commit

08677214e x86: Make 64 bit use early_res instead of bootmem before slab ... Browse Code »

Finally we can use early_res to replace bootmem for x86_64 now.

Still can use CONFIG_NO_BOOTMEM to enable it or not.

-v2: fix 32bit compiling about MAX_DMA32_PFN
-v3: folded bug fix from LKML message below

Signed-off-by: Yinghai Lu
LKML-Reference:
Signed-off-by: H. Peter Anvin

Yinghai Lu
2010-02-13 01:41:59 +0800

16 Dec, 2009

1 commit

8aa043d74 mm/bootmem.c: properly __init-annotate helper functions ... Browse Code »

Signed-off-by: Jan Beulich
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2009-12-16 00:53:20 +0800

10 Nov, 2009

1 commit

9f993ac3f bootmem: Add free_bootmem_late() ... Browse Code »

Add a new function for freeing bootmem after the bootmem
allocator has been released and the unreserved pages given to
the page allocator.

This allows us to reserve bootmem and then release it if we
later discover it was not needed.

( This new API will be used by the swiotlb code to recover
a significant amount of RAM (64MB). )

Signed-off-by: FUJITA Tomonori
Acked-by: Pekka Enberg
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
Cc: hannes@cmpxchg.org
Cc: tj@kernel.org
Cc: akpm@linux-foundation.org
Cc: Linus Torvalds
LKML-Reference:
Signed-off-by: Ingo Molnar

FUJITA Tomonori
2009-11-10 19:31:43 +0800

27 Aug, 2009

1 commit

008139d91 kmemleak: Do not report alloc_bootmem blocks as leaks ... Browse Code »

This patch sets the min_count for alloc_bootmem objects to 0 so that
they are never reported as leaks. This is because many of these blocks
are only referred via the physical address which is not looked up by
kmemleak.

Signed-off-by: Catalin Marinas
Cc: Pekka Enberg

Catalin Marinas
2009-08-27 21:29:17 +0800

08 Jul, 2009

1 commit

ec3a354bd kmemleak: Add callbacks to the bootmem allocator ... Browse Code »

This patch adds kmemleak_alloc/free callbacks to the bootmem allocator.
This would allow scanning of such blocks and help avoiding a whole class
of false positives and more kmemleak annotations.

Signed-off-by: Catalin Marinas
Cc: Ingo Molnar
Acked-by: Pekka Enberg
Reviewed-by: Johannes Weiner

Catalin Marinas
2009-07-08 21:25:14 +0800

20 Jun, 2009

1 commit

433f13a72 bootmem.c: avoid c90 declaration warning ... Browse Code »

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Joe Perches
Cc: Pekka Enberg
Cc: Benjamin Herrenschmidt
Cc: Tejun Heo
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2009-06-20 07:46:04 +0800

12 Jun, 2009

2 commits

c91c4773b bootmem: fix slab fallback on numa ... Browse Code »

If the user requested bootmem allocation on a specific node, we should use
kzalloc_node() for the fallback allocation.

Cc: Ingo Molnar
Cc: Johannes Weiner
Cc: Linus Torvalds
Cc: Yinghai Lu
Signed-off-by: Pekka Enberg

Pekka Enberg
2009-06-12 00:15:54 +0800
441c7e0a2 bootmem: use slab if bootmem is no longer available ... Browse Code »

As a preparation for initializing the slab allocator early, make sure the
bootmem allocator does not crash and burn if someone calls it after slab is up;
otherwise we'd need a flag day for switching to early slab.

Acked-by: Johannes Weiner
Acked-by: Linus Torvalds
Cc: Christoph Lameter
Cc: Ingo Molnar
Cc: Matt Mackall
Cc: Nick Piggin
Cc: Yinghai Lu
Signed-off-by: Pekka Enberg

Pekka Enberg
2009-06-12 00:10:41 +0800

01 Mar, 2009

1 commit

d0c4f5702 bootmem, x86: further fixes for arch-specific bootmem wrapping ... Browse Code »

Impact: fix new breakages introduced by previous fix

Commit c132937556f56ee4b831ef4b23f1846e05fde102 tried to clean up
bootmem arch wrapper but it wasn't quite correct. Before the commit,
the followings were broken.

* Low level interface functions prefixed with __ ignored arch
preference.

* reserve_bootmem(...) can't be mapped into
reserve_bootmem_node(NODE_DATA(0)->bdata, ...) because the node is
not preference here. The region specified MUST fall into the
specified region; otherwise, it will panic.

After the commit,

* If allocation fails for the arch preferred node, it should fallback
to whatever is available. Instead, it simply failed allocation.

There are too many internal details to allow generic wrapping and
still keep things simple for archs. Plus, all that arch wants is a
way to prefer certain node over another.

This patch drops the generic wrapping around alloc_bootmem_core() and
add alloc_bootmem_core() instead. If necessary, arch can define
bootmem_arch_referred_node() macro or function which takes all
allocation information and returns the preferred node. bootmem
generic code will always try the preferred node first and then
fallback to other nodes as usual.

Breakages noted and changes reviewed by Johannes Weiner.

Signed-off-by: Tejun Heo
Acked-by: Johannes Weiner

Tejun Heo
2009-03-01 15:06:56 +0800

24 Feb, 2009

1 commit

c13293755 bootmem: clean up arch-specific bootmem wrapping ... Browse Code »

Impact: cleaner and consistent bootmem wrapping

By setting CONFIG_HAVE_ARCH_BOOTMEM_NODE, archs can define
arch-specific wrappers for bootmem allocation. However, this is done
a bit strangely in that only the high level convenience macros can be
changed while lower level, but still exported, interface functions
can't be wrapped. This not only is messy but also leads to strange
situation where alloc_bootmem() does what the arch wants it to do but
the equivalent __alloc_bootmem() call doesn't although they should be
able to be used interchangeably.

This patch updates bootmem such that archs can override / wrap the
backend function - alloc_bootmem_core() instead of the highlevel
interface functions to allow simpler and consistent wrapping. Also,
HAVE_ARCH_BOOTMEM_NODE is renamed to HAVE_ARCH_BOOTMEM.

Signed-off-by: Tejun Heo
Cc: Johannes Weiner

Tejun Heo
2009-02-24 10:57:20 +0800

07 Jan, 2009

1 commit

594fe1a04 bootmem: print request details before BUG_ON(them) ... Browse Code »

Moving the request details print-out before the sanity checks that
might panic() enables us to analyse invalid requests without having
access to the line information of the stack dump.

Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2009-01-07 07:59:10 +0800

17 Oct, 2008

1 commit

80a914dc0 misc: replace __FUNCTION__ with __func__ ... Browse Code »

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Harvey Harrison
2008-10-17 02:21:30 +0800

21 Aug, 2008

1 commit

481ebd0d7 bootmem: fix aligning of node-relative indexes and offsets ... Browse Code »

Absolute alignment requirements may never be applied to node-relative
offsets. Andreas Herrmann spotted this flaw when a bootmem allocation on
an unaligned node was itself not aligned because the combination of an
unaligned node with an aligned offset into that node is not garuanteed to
be aligned itself.

This patch introduces two helper functions that align a node-relative
index or offset with respect to the node's starting address so that the
absolute PFN or virtual address that results from combining the two
satisfies the requested alignment.

Then all the broken ALIGN()s in alloc_bootmem_core() are replaced by these
helpers.

Signed-off-by: Johannes Weiner
Reported-by: Andreas Herrmann
Debugged-by: Andreas Herrmann
Reviewed-by: Andreas Herrmann
Tested-by: Andreas Herrmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-08-21 06:40:31 +0800

15 Aug, 2008

1 commit

627240aaa bootmem allocator: alloc_bootmem_core(): page-align the end offset ... Browse Code »

This is the minimal sequence that jams the allocator:

void *p, *q, *r;
p = alloc_bootmem(PAGE_SIZE);
q = alloc_bootmem(64);
free_bootmem(p, PAGE_SIZE);
p = alloc_bootmem(PAGE_SIZE);
r = alloc_bootmem(64);

after this sequence (assuming that the allocator was empty or page-aligned
before), pointer "q" will be equal to pointer "r".

What's hapenning inside the allocator:
p = alloc_bootmem(PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE, bitmap contains bits 10000...
q = alloc_bootmem(64);
in allocator: last_end_off == PAGE_SIZE + 64, bitmap contains 11000...
free_bootmem(p, PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE + 64, bitmap contains 01000...
p = alloc_bootmem(PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE, bitmap contains 11000...
r = alloc_bootmem(64);

and now:

it finds bit "2", as a place where to allocate (sidx)

it hits the condition

if (bdata->last_end_off && PFN_DOWN(bdata->last_end_off) + 1 == sidx))
start_off = ALIGN(bdata->last_end_off, align);

-you can see that the condition is true, so it assigns start_off =
ALIGN(bdata->last_end_off, align); (that is PAGE_SIZE) and allocates
over already allocated block.

With the patch it tries to continue at the end of previous allocation only
if the previous allocation ended in the middle of the page.

Signed-off-by: Mikulas Patocka
Acked-by: Johannes Weiner
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mikulas Patocka
2008-08-15 23:35:41 +0800

25 Jul, 2008

10 commits

3560e249a bootmem: replace node_boot_start in struct bootmem_data ... Browse Code »

Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.

[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner
Cc:
Signed-off-by: Lee Schermerhorn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
75a56cfe9 bootmem: revisit alloc_bootmem_section ... Browse Code »

Since alloc_bootmem_core does no goal-fallback anymore and just returns
NULL if the allocation fails, we might now use it in alloc_bootmem_section
without all the fixup code for a misplaced allocation.

Also, the limit can be the first PFN of the next section as the semantics
is that the limit is _above_ the allocated region, not within.

Signed-off-by: Johannes Weiner
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
4cc278b72 bootmem: Make __alloc_bootmem_low_node fall back to other nodes ... Browse Code »

__alloc_bootmem_node already does this, make the interface consistent.

Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
0f3caba21 bootmem: respect goal more likely ... Browse Code »

The old node-agnostic code tried allocating on all nodes starting from the
one with the lowest range. alloc_bootmem_core retried without the goal if
it could not satisfy it and so the goal was only respected at all when it
happened to be on the first (lowest page numbers) node (or theoretically
if allocations failed on all nodes before to the one holding the goal).

Introduce a non-panicking helper that starts allocating from the node
holding the goal and falls back only after all thes tries failed, thus
moving the goal fallback code out of alloc_bootmem_core.

Make all other allocation functions benefit from this new helper.

Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
e2bf3cae5 bootmem: factor out the marking of a PFN range ... Browse Code »

Introduce new helpers that mark a range that resides completely on a node
or node-agnostic ranges that might also span node boundaries.

The free/reserve API functions will then directly use these helpers.

Note that the free/reserve semantics become more strict: while the prior
code took basically arbitrary range arguments and marked the PFNs that
happen to fall into that range, the new code requires node-specific ranges
to be completely on the node. The node-agnostic requests might span node
boundaries as long as the nodes are contiguous.

Passing ranges that do not satisfy these criteria is a bug.

[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
d747fa4bc bootmem: free/reserve helpers ... Browse Code »

Factor out the common operation of marking a range on the bitmap.

[akpm@linux-foundation.org: fix various warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
5f2809e69 bootmem: clean up alloc_bootmem_core ... Browse Code »

alloc_bootmem_core has become quite nasty to read over time. This is a
clean rewrite that keeps the semantics.

bdata->last_pos has been dropped.

bdata->last_success has been renamed to hint_idx and it is now an index
relative to the node's range. Since further block searching might start
at this index, it is now set to the end of a succeeded allocation rather
than its beginning.

bdata->last_offset has been renamed to last_end_off to be more clear that
it represents the ending address of the last allocation relative to the
node.

[y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
Signed-off-by: Johannes Weiner
Signed-off-by: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
41546c174 bootmem: clean up free_all_bootmem_core ... Browse Code »
43

Rewrite the code in a more concise way using less variables.

[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
636cc40cb bootmem: revisit bootmem descriptor list handling ... Browse Code »

link_bootmem handles an insertion of a new descriptor into the sorted list
in more or less three explicit branches; empty list, insert in between and
append. These cases can be expressed implicite.

Also mark the sorted list as initdata as it can be thrown away after boot
as well.

Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800
df049a5f4 bootmem: revisit bitmap size calculations ... Browse Code »

Reincarnate get_mapsize as bootmap_bytes and implement
bootmem_bootmap_pages on top of it.

Adjust users of these helpers and make free_all_bootmem_core use
bootmem_bootmap_pages instead of open-coding it.

Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2008-07-25 01:47:20 +0800