17 Oct, 2008
1 commit
-
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Aug, 2008
1 commit
-
Absolute alignment requirements may never be applied to node-relative
offsets. Andreas Herrmann spotted this flaw when a bootmem allocation on
an unaligned node was itself not aligned because the combination of an
unaligned node with an aligned offset into that node is not guaranteed to
be aligned itself.

This patch introduces two helper functions that align a node-relative
index or offset with respect to the node's starting address so that the
absolute PFN or virtual address that results from combining the two
satisfies the requested alignment.

Then all the broken ALIGN()s in alloc_bootmem_core() are replaced by these
helpers.
Signed-off-by: Johannes Weiner
Reported-by: Andreas Herrmann
Debugged-by: Andreas Herrmann
Reviewed-by: Andreas Herrmann
Tested-by: Andreas Herrmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
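The node-relative alignment described in the commit above can be sketched in user-space C. This is only an illustration under stated assumptions: ALIGN_UP, align_node_off() and align_off_only() are hypothetical names, not the helpers the patch adds, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((uintptr_t)(a) - 1))

/*
 * Hypothetical helper: align a node-relative byte offset so that the
 * absolute address node_start + result satisfies the alignment.
 */
static uintptr_t align_node_off(uintptr_t node_start, uintptr_t off,
				uintptr_t align)
{
	return ALIGN_UP(node_start + off, align) - node_start;
}

/* The broken pattern: aligning the offset alone ignores node_start. */
static uintptr_t align_off_only(uintptr_t off, uintptr_t align)
{
	return ALIGN_UP(off, align);
}
```

For a node starting at 0x1800, aligning only the offset 0x100 to 0x1000 gives the absolute address 0x2800, which is not aligned; the node-aware variant returns offset 0x800, i.e. the aligned absolute address 0x2000.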
15 Aug, 2008
1 commit
-
This is the minimal sequence that jams the allocator:

void *p, *q, *r;
p = alloc_bootmem(PAGE_SIZE);
q = alloc_bootmem(64);
free_bootmem(p, PAGE_SIZE);
p = alloc_bootmem(PAGE_SIZE);
r = alloc_bootmem(64);

After this sequence (assuming that the allocator was empty or page-aligned
before), pointer "q" will be equal to pointer "r".

What's happening inside the allocator:

p = alloc_bootmem(PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE, bitmap contains bits 10000...
q = alloc_bootmem(64);
in allocator: last_end_off == PAGE_SIZE + 64, bitmap contains 11000...
free_bootmem(p, PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE + 64, bitmap contains 01000...
p = alloc_bootmem(PAGE_SIZE);
in allocator: last_end_off == PAGE_SIZE, bitmap contains 11000...
r = alloc_bootmem(64);

and now:
- it finds bit "2" as a place to allocate (sidx)
- it hits the condition
if (bdata->last_end_off && PFN_DOWN(bdata->last_end_off) + 1 == sidx)
    start_off = ALIGN(bdata->last_end_off, align);
- you can see that the condition is true, so it assigns start_off =
ALIGN(bdata->last_end_off, align); (that is PAGE_SIZE) and allocates
over the already allocated block.

With the patch it tries to continue at the end of the previous allocation only
if the previous allocation ended in the middle of the page.
Signed-off-by: Mikulas Patocka
Acked-by: Johannes Weiner
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
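The difference between the condition quoted above and the fixed behaviour can be modelled in a few lines of user-space C. The exact expression used by the patch is not shown in this log, so merge_fixed() below is an assumption derived from the sentence "only if the previous allocation ended in the middle of the page"; the 4 KiB page size and both function names are likewise illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12UL			/* assumed 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

/*
 * Condition quoted in the commit message: it also fires when the last
 * allocation ended exactly on a page boundary.
 */
static bool merge_old(unsigned long last_end_off, unsigned long sidx)
{
	return last_end_off && PFN_DOWN(last_end_off) + 1 == sidx;
}

/*
 * Assumed shape of the fix: continue after the previous allocation only
 * if it ended in the middle of a page.
 */
static bool merge_fixed(unsigned long last_end_off, unsigned long sidx)
{
	return (last_end_off & (PAGE_SIZE - 1)) &&
	       PFN_DOWN(last_end_off) + 1 == sidx;
}
```

With last_end_off == PAGE_SIZE and sidx == 2, as in the trace above, merge_old() is true while merge_fixed() is false; with last_end_off == PAGE_SIZE + 64 both still allow the merge.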
25 Jul, 2008
20 commits
-
Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.

[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner
Cc:
Signed-off-by: Lee Schermerhorn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Since alloc_bootmem_core no longer does goal fallback and just returns
NULL if the allocation fails, we can now use it in alloc_bootmem_section
without all the fixup code for a misplaced allocation.

Also, the limit can be the first PFN of the next section, as the semantics
are that the limit is _above_ the allocated region, not within it.
Signed-off-by: Johannes Weiner
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
__alloc_bootmem_node already does this; make the interface consistent.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The old node-agnostic code tried allocating on all nodes starting from the
one with the lowest range. alloc_bootmem_core retried without the goal if
it could not satisfy it, so the goal was only respected at all when it
happened to be on the first (lowest page numbers) node (or, theoretically,
if allocations failed on all nodes before the one holding the goal).

Introduce a non-panicking helper that starts allocating from the node
holding the goal and falls back only after all these tries failed, thus
moving the goal fallback code out of alloc_bootmem_core.

Make all other allocation functions benefit from this new helper.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Introduce new helpers that mark a range that resides completely on a node
or node-agnostic ranges that might also span node boundaries.

The free/reserve API functions will then directly use these helpers.

Note that the free/reserve semantics become more strict: while the prior
code took basically arbitrary range arguments and marked the PFNs that
happen to fall into that range, the new code requires node-specific ranges
to be completely on the node. The node-agnostic requests might span node
boundaries as long as the nodes are contiguous.

Passing ranges that do not satisfy these criteria is a bug.
[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Factor out the common operation of marking a range on the bitmap.
[akpm@linux-foundation.org: fix various warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
alloc_bootmem_core has become quite nasty to read over time. This is a
clean rewrite that keeps the semantics.

bdata->last_pos has been dropped.

bdata->last_success has been renamed to hint_idx and it is now an index
relative to the node's range. Since further block searching might start
at this index, it is now set to the end of a succeeded allocation rather
than its beginning.

bdata->last_offset has been renamed to last_end_off to make it clearer that
it represents the ending address of the last allocation relative to the
node.

[y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
Signed-off-by: Johannes Weiner
Signed-off-by: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Rewrite the code in a more concise way using fewer variables.
[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
link_bootmem handles an insertion of a new descriptor into the sorted list
in more or less three explicit branches: empty list, insert in between, and
append. These cases can be expressed implicitly.

Also mark the sorted list as initdata, as it can be thrown away after boot
as well.
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Reincarnate get_mapsize as bootmap_bytes and implement
bootmem_bootmap_pages on top of it.

Adjust users of these helpers and make free_all_bootmem_core use
bootmem_bootmap_pages instead of open-coding it.
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Introduce the bootmem_debug kernel parameter that enables very verbose
diagnostics regarding all range operations of bootmem as well as the
initialization and release of nodes.

[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Change the description, move a misplaced comment about the allocator
itself and add me to the list of copyright holders.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This only reorders functions so that further patches will be easier to
read. No code changed.
Signed-off-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Straightforward non-panicking variant of the existing __alloc_bootmem_node.
Used by a subsequent patch when allocating giant hugepages at boot -- don't
want to panic if we can't allocate as many as the user asked for.
Signed-off-by: Andi Kleen
Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This function has no external callers, so unexport it. Also fix its naming
inconsistency.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Christoph Lameter
Cc: Mel Gorman
Cc: Andy Whitcroft
Cc: Mel Gorman
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
All _core functions only need the bootmem data, not the whole node descriptor.
Adjust the two functions that take the node descriptor unnecessarily.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Christoph Lameter
Cc: Mel Gorman
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The check for node_boot_start is bogus because we start freeing at the
corresponding pfn. So check instead, in a more readable way, whether the pfn
is properly aligned, and adjust the documentation.

Also remove an unneeded accounting variable.
Signed-off-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Yinghai Lu
Cc: Christoph Lameter
Cc: Mel Gorman
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There are a lot of places that define either a single bootmem descriptor or an
array of them. Use only one central array with MAX_NUMNODES items instead.
Signed-off-by: Johannes Weiner
Acked-by: Ralf Baechle
Cc: Ingo Molnar
Cc: Richard Henderson
Cc: Russell King
Cc: Tony Luck
Cc: Hirokazu Takata
Cc: Geert Uytterhoeven
Cc: Kyle McMartin
Cc: Paul Mackerras
Cc: Paul Mundt
Cc: David S. Miller
Cc: Yinghai Lu
Cc: Christoph Lameter
Cc: Mel Gorman
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There are a number of different views of how much memory is currently active:
the arch-independent zone-sizing view, the bootmem allocator's view and the
memory model's view.

Architectures register this information at different times, and the views are
not necessarily in sync, particularly with respect to some SPARSEMEM limitations.

This patch introduces mminit_validate_memmodel_limits(), which is able to
validate and correct PFN ranges with respect to the memory model. It is only
SPARSEMEM that currently validates itself.
Signed-off-by: Mel Gorman
Cc: Christoph Lameter
Cc: Andy Whitcroft
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
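As a rough illustration of what "validate and correct PFN ranges" can mean, the sketch below clamps a PFN range to a model-specific upper limit. It is not the body of mminit_validate_memmodel_limits(): the function name clamp_pfn_range(), the max_pfn parameter, and the choice to collapse a range lying entirely above the limit are assumptions made for this example.

```c
#include <assert.h>

/*
 * Hypothetical user-space analogue: clamp [*start_pfn, *end_pfn) to a
 * maximum PFN. max_pfn stands in for whatever limit the memory model
 * imposes.
 */
static void clamp_pfn_range(unsigned long *start_pfn,
			    unsigned long *end_pfn,
			    unsigned long max_pfn)
{
	if (*start_pfn > max_pfn) {
		/* Range lies entirely above the limit: make it empty. */
		*start_pfn = max_pfn;
		*end_pfn = max_pfn;
	} else if (*end_pfn > max_pfn) {
		/* Range straddles the limit: cut off the tail. */
		*end_pfn = max_pfn;
	}
}
```

A caller would pass pointers to its own start and end PFNs before registering the range, so that every later view of memory works from the corrected values.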
22 Jun, 2008
1 commit
-
This patch changes the function reserve_bootmem_node() from void to int,
returning -ENOMEM if the allocation fails.

This fixes a build problem on x86 with CONFIG_KEXEC=y and
CONFIG_NEED_MULTIPLE_NODES=y
Signed-off-by: Bernhard Walle
Reported-by: Adrian Bunk
Signed-off-by: Linus Torvalds
28 Apr, 2008
2 commits
-
alloc_bootmem_section() can allocate memory within a specified section. A later
patch uses this to keep the usemap in the same section as the pgdat.
Signed-off-by: Yasunori Goto
Cc: Badari Pulavarty
Cc: Yinghai Lu
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch set frees pages that were allocated by bootmem, for
memory-hotremove. Some memory management structures are allocated by
bootmem, e.g. the memmap.

To remove memory physically, some of them must be freed depending on the
circumstances. This patch set lays the basis for freeing those pages, and
frees memmaps.

My basic idea is to use the remaining members of struct page to remember
information about the users of bootmem (section number or node id). When a
section is being removed, the kernel can check it. With this information,
some issues can be solved.

1) When the memmap of the section being removed was allocated on another
section by bootmem, it should/can be freed.
2) When the memmap of the section being removed was allocated on the
same section, it shouldn't be freed, because the section has to be
logically offlined already and all its pages must be isolated from the
page allocator. If it were freed, the page allocator might use memory that
will soon be removed physically.
3) When the section being removed holds another section's memmap, the
kernel will be able to easily show the user which section should be removed
before it. (Not implemented yet)
4) In case 2) above, page isolation will be able to check and skip the
memmap's pages during logical memory offline (offline_pages()).
The current page isolation code fails in this case because such a page is
just a reserved page and the code can't tell whether the page can be
removed or not. With this patch, it will be able to.
(Not implemented yet.)
5) Node information such as pgdat has similar issues, which can be
solved the same way.
(Not implemented yet, but the node id is remembered in the pages.)

Fortunately, the current bootmem allocator just keeps the PageReserved flags
and doesn't use any other members of struct page. The users of
bootmem don't use them either.

This patch:

This registers information, namely the node or section id. The kernel can
distinguish which node/section uses the pages allocated by bootmem. This is
the basis for hot-removing sections or nodes.
Signed-off-by: Yasunori Goto
Cc: Badari Pulavarty
Cc: Yinghai Lu
Cc: Yasunori Goto
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Apr, 2008
3 commits
-
Split reserve_bootmem_core() into two functions, one which checks for
conflicts and one which sets the bits.

Also make reserve_bootmem loop over bdata_list so that a range can cross nodes.
Users could be crashkernel and ramdisk, in case the range provided
by those externalities crosses nodes.
Signed-off-by: Yinghai Lu
Signed-off-by: Ingo Molnar
-
Offset alignment is needed when node_boot_start's alignment is less than
the required alignment.

Use a local node_boot_start to match the alignment, so no extra operation
is added to the search loop.
Signed-off-by: Yinghai Lu
Signed-off-by: Ingo Molnar
-
Make the nodes other than node 0 use bdata->last_success for fast
search too.

We need to use __alloc_bootmem_core() for vmemmap allocation for other
nodes when NUMA and sparsemem/vmemmap are enabled.

Also, make the fail_block path increase i by incr only after ALIGN
to avoid an extra increase when size is larger than align.
Signed-off-by: Yinghai Lu
Signed-off-by: Ingo Molnar
25 Mar, 2008
1 commit
-
With NUMA enabled, some callers could have a range of memory on one node
but try to free it on another node. This can cause some pages to be
freed wrongly.

For example: when we try to allocate 128g of boot RAM early for
gart/swiotlb, and free that range later so gart/swiotlb can get some
range afterwards.

With this patch, we don't need to care which node holds the range; we just
loop to call free_bootmem_node for all online nodes.

This patch makes free_bootmem_core() more robust by trimming sidx
and eidx according to the RAM range that the node has.

It also makes free_bootmem_core handle the out-of-range case. We could
use bdata_list to make sure the range can be freed. Then next
time, we don't need to loop over online nodes and could use free_bootmem
directly.
Signed-off-by: Yinghai Lu
Cc: Andi Kleen
Cc: Yasunori Goto
Cc: KAMEZAWA Hiroyuki
Acked-by: Ingo Molnar
Tested-by: Ingo Molnar
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
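The trimming of sidx and eidx mentioned in the commit above can be pictured with a small user-space model. Everything here is an assumption for illustration: struct node_range, its field names and trim_to_node() are hypothetical, and ranges are expressed as half-open PFN intervals.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the PFNs a node covers: [min_pfn, end_pfn). */
struct node_range {
	unsigned long min_pfn;
	unsigned long end_pfn;
};

/*
 * Clamp the PFN range [start, end) to the node and convert it to
 * node-relative indices. Returns false if the range does not touch the
 * node at all, in which case *sidx and *eidx are left untouched.
 */
static bool trim_to_node(const struct node_range *n,
			 unsigned long start, unsigned long end,
			 unsigned long *sidx, unsigned long *eidx)
{
	if (end <= n->min_pfn || start >= n->end_pfn)
		return false;		/* out of range for this node */

	if (start < n->min_pfn)
		start = n->min_pfn;
	if (end > n->end_pfn)
		end = n->end_pfn;

	*sidx = start - n->min_pfn;
	*eidx = end - n->min_pfn;
	return true;
}
```

A caller that loops over all nodes can then hand the same range to every node and let each one free only the part it actually owns.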
08 Feb, 2008
1 commit
-
This patchset adds a flags variable to reserve_bootmem() and uses the
BOOTMEM_EXCLUSIVE flag in the crashkernel reservation code to detect collisions
between the crashkernel area and already used memory.

This patch:

Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
If that flag is set, the function returns -EBUSY if the memory has already
been reserved in the past. This is to avoid conflicts.

Because that code runs before SMP initialisation, there's no race condition
inside reserve_bootmem_core().

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: Bernhard Walle
Cc:
Cc: "Eric W. Biederman"
Cc: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
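The exclusive-reservation semantics described above can be shown with a toy bitmap in user-space C. This is a sketch, not the kernel code: toy_map, toy_reserve() and the TOY_* flag values are hypothetical stand-ins, and checking the whole range before setting any bit is a design choice of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_PAGES	64
#define TOY_DEFAULT	0	/* hypothetical: re-reserving is allowed */
#define TOY_EXCLUSIVE	1	/* hypothetical: fail on already reserved pages */

static bool toy_map[TOY_PAGES];	/* true means the page is reserved */

/* Reserve pages [sidx, eidx); with TOY_EXCLUSIVE, report collisions. */
static int toy_reserve(unsigned int sidx, unsigned int eidx, int flags)
{
	unsigned int i;

	if (sidx > eidx || eidx > TOY_PAGES)
		return -EINVAL;

	/* Check the whole range first so a failure changes nothing. */
	if (flags & TOY_EXCLUSIVE)
		for (i = sidx; i < eidx; i++)
			if (toy_map[i])
				return -EBUSY;

	for (i = sidx; i < eidx; i++)
		toy_map[i] = true;
	return 0;
}
```

A second exclusive request that overlaps an earlier reservation fails with -EBUSY and leaves the map unchanged, while the same request without the flag succeeds silently.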
08 Dec, 2006
2 commits
-
In time for 2.6.20, we can get rid of this junk.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
When booting a NUMA system with nodes that have no memory (e.g. by limiting
memory), bootmem_alloc_core tried to find pages in an uninitialized
bootmem_map. This caused a null pointer access. This fix adds a check so
that NULL is returned. That will enable the caller (bootmem_alloc_nopanic)
to allocate memory on another node without a panic.
Signed-off-by: Christian Krafft
Cc: Christoph Lameter
Cc: Andy Whitcroft
Cc: Martin Bligh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Sep, 2006
6 commits
-
Introduce ARCH_LOW_ADDRESS_LIMIT, which can be set per architecture to
override the 4GB default limit used by the bootmem allocator within
__alloc_bootmem_low() and __alloc_bootmem_low_node(). E.g. s390 needs a
2GB limit instead of 4GB.
Acked-by: Ingo Molnar
Cc: Martin Schwidefsky
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
It fixes various coding style issues, especially where spaces are useless. For
example, '*' goes next to the function name.
Signed-off-by: Franck Bui-Huu
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
It also creates a get_mapsize() helper in order to make the code more readable
when it calculates the boot bitmap size.
Signed-off-by: Franck Bui-Huu
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Signed-off-by: Franck Bui-Huu
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Signed-off-by: Franck Bui-Huu
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Signed-off-by: Franck Bui-Huu
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Jul, 2006
1 commit
-
This patch marks an unused export as EXPORT_UNUSED_SYMBOL.
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds