28 Apr, 2008

2 commits

  • alloc_bootmem_section() can allocate memory from a specified section's
    area. A later patch uses this for the usemap, to keep it in the same
    section as its pgdat.

    Signed-off-by: Yasunori Goto
    Cc: Badari Pulavarty
    Cc: Yinghai Lu
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
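
    A minimal user-space sketch of what "allocating from a specific
    section" means, assuming the usual sparsemem arithmetic; the constants
    and helper below are illustrative, not the kernel's exact interface:

        #include <stdio.h>

        /* Illustrative constants: one section = 2^SECTION_SHIFT bytes. */
        #define PAGE_SHIFT        12
        #define SECTION_SHIFT     27   /* 128 MB sections */
        #define PAGES_PER_SECTION (1UL << (SECTION_SHIFT - PAGE_SHIFT))

        /* Turn a section number into the [goal, limit) physical range
         * that a bootmem allocation would be constrained to. */
        static void section_range(unsigned long section_nr,
                                  unsigned long *goal, unsigned long *limit)
        {
            unsigned long pfn = section_nr * PAGES_PER_SECTION;

            *goal  = pfn << PAGE_SHIFT;                       /* start */
            *limit = (pfn + PAGES_PER_SECTION) << PAGE_SHIFT; /* end   */
        }

        int main(void)
        {
            unsigned long goal, limit;

            section_range(3, &goal, &limit);
            printf("section 3: goal=%#lx limit=%#lx\n", goal, limit);
            return 0;
        }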
     
  • This patch set frees pages that were allocated by bootmem, for memory
    hot-remove. Some memory-management structures (the memmap, for example)
    are allocated by bootmem.

    To remove memory physically, some of them must be freed depending on the
    circumstances. This patch set lays the groundwork for freeing those
    pages, and frees the memmaps.

    The basic idea is to use the otherwise unused members of struct page to
    remember which user of bootmem (section number or node id) owns each
    page. When a section is being removed, the kernel can check this
    information. It lets us solve several issues:

    1) When the memmap of a section being removed was allocated in another
    section by bootmem, it should and can be freed.
    2) When the memmap of a section being removed was allocated in the
    same section, it must not be freed. The section must already have been
    logically offlined, with all of its pages isolated from the page
    allocator; if the memmap were freed, the page allocator might hand out
    pages that are about to be removed physically.
    3) When a section being removed holds another section's memmap, the
    kernel will be able to easily show the user which section should be
    removed first. (Not implemented yet.)
    4) In case 2) above, page isolation will be able to recognise and skip
    the memmap's pages during logical memory offline (offline_pages()).
    The current page isolation code fails in this case because such a page
    is just a reserved page, and it cannot tell whether the page can be
    removed or not. This patch will make that possible.
    (Not implemented yet.)
    5) Node information such as the pgdat has similar issues, which this
    approach will also be able to solve.
    (Not implemented yet, but the node id is remembered in the pages.)

    Fortunately, the current bootmem allocator just keeps the PageReserved
    flag and doesn't use any other members of struct page; the users of
    bootmem don't use them either.

    This patch:

    This registers the node or section id in the pages, so the kernel can
    distinguish which node/section uses the pages allocated by bootmem. This
    is the basis for hot-removing sections or nodes.

    Signed-off-by: Yasunori Goto
    Cc: Badari Pulavarty
    Cc: Yinghai Lu
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
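
    A rough model of the bookkeeping this series describes, assuming a
    simplified struct page; the kernel reuses otherwise-unused page
    members, but the field names, packing scheme, and helpers below are
    illustrative only:

        #include <stdio.h>

        /* Stand-in for struct page: bootmem pages are reserved and their
         * other members are unused, so one can hold owner information. */
        struct page {
            unsigned int reserved : 1;
            unsigned long private;   /* section number or node id */
        };

        enum owner_type { OWNER_SECTION, OWNER_NODE };

        /* Remember which section/node a bootmem page belongs to. */
        static void register_bootmem_info(struct page *pg,
                                          enum owner_type t,
                                          unsigned long id)
        {
            pg->reserved = 1;
            pg->private = (id << 1) | t;   /* pack type and id */
        }

        /* At hot-remove time, decide whether the page may be freed: a
         * memmap page living in the section being removed must stay. */
        static int may_free_on_remove(const struct page *pg,
                                      unsigned long removing_section)
        {
            if ((pg->private & 1) != OWNER_SECTION)
                return 0;
            return (pg->private >> 1) != removing_section;
        }

        int main(void)
        {
            struct page p;

            register_bootmem_info(&p, OWNER_SECTION, 5);
            printf("free while removing section 5? %d\n",
                   may_free_on_remove(&p, 5)); /* 0: same section */
            printf("free while removing section 7? %d\n",
                   may_free_on_remove(&p, 7)); /* 1: lives elsewhere */
            return 0;
        }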
     

27 Apr, 2008

3 commits

  • Split reserve_bootmem_core() into two functions: one that checks for
    conflicts, and one that sets the bits.

    Also make reserve_bootmem() loop over bdata_list, so reservations can
    cross nodes.

    Users could be crashkernel and ramdisk..., in case the range provided by
    those externalities crosses node boundaries.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
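
    A toy illustration of the check/set split on a page bitmap;
    can_reserve() and do_reserve() are made-up names standing in for the
    two halves of reserve_bootmem_core():

        #include <stdio.h>
        #include <string.h>

        #define NPAGES 64
        static unsigned char boot_map[NPAGES];  /* 1 = already reserved */

        /* First half: only check for conflicts, touch nothing. */
        static int can_reserve(unsigned long sidx, unsigned long eidx)
        {
            for (unsigned long i = sidx; i < eidx; i++)
                if (boot_map[i])
                    return 0;   /* conflict */
            return 1;
        }

        /* Second half: actually set the bits. */
        static void do_reserve(unsigned long sidx, unsigned long eidx)
        {
            for (unsigned long i = sidx; i < eidx; i++)
                boot_map[i] = 1;
        }

        int main(void)
        {
            memset(boot_map, 0, sizeof(boot_map));
            /* Checking first and committing second means a reservation
             * that spans several maps either fully succeeds or fails. */
            if (can_reserve(10, 20))
                do_reserve(10, 20);
            printf("second attempt conflicts: %d\n", !can_reserve(15, 25));
            return 0;
        }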
     
  • We need an offset alignment when node_boot_start's alignment is less
    than the alignment requested.

    Use a local, aligned copy of node_boot_start to match the alignment, so
    no extra operation is added inside the search loop.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
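
    A sketch of the offset-alignment arithmetic, assuming a standard
    power-of-two ALIGN(); the point is that when node_boot_start itself is
    not sufficiently aligned, bitmap indexes must be shifted by a
    pre-computed offset so that an aligned index yields an aligned
    physical address (the values below are invented for the demo):

        #include <stdio.h>

        #define PAGE_SHIFT 12
        #define ALIGN(x, a) \
            (((x) + ((a) - 1)) & ~((unsigned long)(a) - 1))

        int main(void)
        {
            unsigned long node_boot_start = 0x8000; /* 32 KB aligned  */
            unsigned long align = 0x10000;          /* want 64 KB     */

            /* Computed once, outside the search loop: how many pages
             * indexes must be shifted by so that "aligned index" really
             * means "aligned physical address". */
            unsigned long aligned_start = ALIGN(node_boot_start, align);
            unsigned long offset =
                (aligned_start - node_boot_start) >> PAGE_SHIFT;
            unsigned long incr = align >> PAGE_SHIFT;

            for (unsigned long i = offset; i < 64; i += incr) {
                unsigned long phys = node_boot_start + (i << PAGE_SHIFT);
                printf("candidate idx %lu -> phys %#lx\n", i, phys);
            }
            return 0;
        }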
     
  • Make nodes other than node 0 use bdata->last_success for fast searching
    too.

    We need to use __alloc_bootmem_core() for vmemmap allocation on other
    nodes when NUMA and sparsemem/vmemmap are enabled.

    Also, make the fail_block path increase i by incr only after the ALIGN
    step, to avoid an extra increase when size is larger than align.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
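
    A simplified model of the last_success hint and the fail_block
    stepping; the data structures are toys, not the kernel's:

        #include <stdio.h>

        #define NPAGES 64
        #define ALIGN(x, a) \
            (((x) + ((a) - 1)) & ~((unsigned long)(a) - 1))

        static unsigned char boot_map[NPAGES];  /* 1 = page in use    */
        static unsigned long last_success;      /* per-node hint      */

        /* Find and take `pages` free pages, stepping candidates by
         * `incr`; start at last_success so repeated allocations skip
         * known-busy space. */
        static long find_free_run(unsigned long pages, unsigned long incr)
        {
            unsigned long i = ALIGN(last_success, incr);

            while (i + pages <= NPAGES) {
                unsigned long j;
                for (j = i; j < i + pages; j++)
                    if (boot_map[j])
                        goto fail_block;
                for (j = i; j < i + pages; j++)
                    boot_map[j] = 1;        /* commit the run */
                last_success = i + pages;   /* hint for next caller */
                return (long)i;
        fail_block:
                /* The fix: ALIGN the failing position first and only
                 * then step, so a size larger than the alignment does
                 * not over-advance past valid candidates. */
                i = ALIGN(j + 1, incr);
            }
            return -1;
        }

        int main(void)
        {
            boot_map[0] = 1;   /* pretend page 0 is already taken */
            printf("run of 4 at index %ld\n", find_free_run(4, 4));
            printf("next run of 4 at index %ld\n", find_free_run(4, 4));
            return 0;
        }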
     

25 Mar, 2008

1 commit

  • With NUMA enabled, some callers could have a range of memory on one node
    but try to free it on another node, which can cause some pages to be
    freed wrongly.

    For example: we may try to allocate 128G of boot RAM early for
    gart/swiotlb, and free that range later so gart/swiotlb can get some
    range afterwards.

    With this patch, we don't need to care which node holds the range; we
    just loop over all online nodes, calling free_bootmem_node() for each.

    This patch makes free_bootmem_core() more robust by trimming sidx and
    eidx according to the RAM range that the node actually has.

    It also makes free_bootmem_core() handle the out-of-range case. We could
    use bdata_list to make sure the range can be freed for sure; then, next
    time, we wouldn't need to loop over the online nodes and could use
    free_bootmem() directly.

    Signed-off-by: Yinghai Lu
    Cc: Andi Kleen
    Cc: Yasunori Goto
    Cc: KAMEZAWA Hiroyuki
    Acked-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yinghai Lu
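
    A sketch of the sidx/eidx trimming idea in plain C, with an invented
    per-node descriptor; the real bootmem code works on bitmaps, but the
    clamping logic is the same in spirit:

        #include <stdio.h>

        #define PAGE_SHIFT 12

        /* Toy per-node bootmem descriptor. */
        struct bdata {
            unsigned long node_boot_start; /* physical start of node */
            unsigned long node_low_pfn;    /* first pfn past node    */
        };

        /* Clamp a physical range to the node before freeing, so a
         * caller may pass the same range to every online node and each
         * frees only the part it owns (out-of-range frees nothing). */
        static void free_bootmem_core(struct bdata *b,
                                      unsigned long addr,
                                      unsigned long size)
        {
            unsigned long start_pfn = addr >> PAGE_SHIFT;
            unsigned long end_pfn = (addr + size) >> PAGE_SHIFT;
            unsigned long node_start = b->node_boot_start >> PAGE_SHIFT;

            if (start_pfn < node_start)    start_pfn = node_start;
            if (end_pfn > b->node_low_pfn) end_pfn = b->node_low_pfn;
            if (start_pfn >= end_pfn)
                return;            /* nothing of ours in the range */

            printf("node @%#lx frees pfns [%lu, %lu)\n",
                   b->node_boot_start, start_pfn, end_pfn);
        }

        int main(void)
        {
            struct bdata n0 = { 0x0,      0x100 }; /* pfns [0, 256)   */
            struct bdata n1 = { 0x100000, 0x200 }; /* pfns [256, 512) */

            /* Free one range spanning both nodes by looping. */
            free_bootmem_core(&n0, 0xF0000, 0x20000);
            free_bootmem_core(&n1, 0xF0000, 0x20000);
            return 0;
        }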
     

08 Feb, 2008

1 commit

  • This patchset adds a flags argument to reserve_bootmem() and uses the
    BOOTMEM_EXCLUSIVE flag in the crashkernel reservation code to detect
    collisions between the crashkernel area and already-used memory.

    This patch:

    Change the reserve_bootmem() function to accept a new flag,
    BOOTMEM_EXCLUSIVE. If that flag is set, the function returns -EBUSY if
    the memory has already been reserved in the past. This is to avoid
    conflicts.

    Because that code runs before SMP initialisation, there's no race
    condition inside reserve_bootmem_core().

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix powerpc build]
    Signed-off-by: Bernhard Walle
    Cc:
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bernhard Walle
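
    A small model of the BOOTMEM_EXCLUSIVE behaviour, using a toy bitmap
    in place of the real reservation map:

        #include <stdio.h>
        #include <errno.h>

        #define NPAGES 64
        #define BOOTMEM_DEFAULT   0
        #define BOOTMEM_EXCLUSIVE 1

        static unsigned char boot_map[NPAGES];  /* 1 = already reserved */

        /* With BOOTMEM_EXCLUSIVE, refuse (rather than silently merge
         * with) an overlap with an earlier reservation. */
        static int reserve_bootmem(unsigned long sidx, unsigned long eidx,
                                   int flags)
        {
            unsigned long i;

            if (flags & BOOTMEM_EXCLUSIVE)
                for (i = sidx; i < eidx; i++)
                    if (boot_map[i])
                        return -EBUSY;

            for (i = sidx; i < eidx; i++)
                boot_map[i] = 1;
            return 0;
        }

        int main(void)
        {
            reserve_bootmem(10, 20, BOOTMEM_DEFAULT);
            /* A crashkernel area colliding with used memory is caught: */
            printf("exclusive overlap -> %d (expect %d)\n",
                   reserve_bootmem(15, 25, BOOTMEM_EXCLUSIVE), -EBUSY);
            return 0;
        }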
     

10 Apr, 2006

1 commit

  • The node setup code would try to allocate the node metadata in the node
    itself, but that fails if there is no memory in there.

    This can happen with memory hotplug, when the hotplug area defines a
    so-far-empty node.

    Now use bootmem to try to allocate the mem_map in other nodes.

    If that fails, don't panic; just ignore the node.

    To make this work, I added a new __alloc_bootmem_nopanic function that
    does what its name implies.

    TBD: we should try to use nearby nodes here. Currently we just use any.
    It's hard to do better because bootmem doesn't have proper fallback
    lists yet.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
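
    A sketch of the panic/nopanic split, with malloc standing in for a
    per-node bootmem allocator; the structure, not the names, is the
    point:

        #include <stdio.h>
        #include <stdlib.h>

        #define NNODES 3

        /* Pretend per-node allocator: nodes 0 and 1 are empty here. */
        static void *alloc_from_node(int nid, size_t size)
        {
            return nid == 2 ? malloc(size) : NULL;
        }

        /* Try every node; return NULL instead of panicking on total
         * failure, so callers like the node-setup code can simply skip
         * an empty node. */
        static void *__alloc_bootmem_nopanic(size_t size)
        {
            for (int nid = 0; nid < NNODES; nid++) {
                void *p = alloc_from_node(nid, size);
                if (p)
                    return p;
            }
            return NULL;   /* caller decides what failure means */
        }

        static void *__alloc_bootmem(size_t size)
        {
            void *p = __alloc_bootmem_nopanic(size);
            if (!p) {
                fprintf(stderr, "bootmem alloc of %zu failed!\n", size);
                exit(1);   /* stands in for panic() */
            }
            return p;
        }

        int main(void)
        {
            printf("nopanic got %p\n", __alloc_bootmem_nopanic(64));
            printf("panicking variant got %p\n", __alloc_bootmem(64));
            return 0;
        }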
     

28 Mar, 2006

1 commit

  • Add a list_head to bootmem_data_t and make the bootmem nodes use it. The
    bootmem list is sorted by node_boot_start.

    Only nodes for which init_bootmem() is called are linked into the list.
    (i386 allocates bootmem only from one node (0), not from all online
    nodes.)

    A summary:
    1. for_each_online_pgdat() traverses all *online* nodes.
    2. alloc_bootmem() allocates memory only from nodes initialised for
    bootmem.

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
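
    A user-space sketch of sorted insertion into the bootmem list, with a
    plain next pointer standing in for the kernel's list_head:

        #include <stdio.h>

        /* Toy bootmem_data_t with an intrusive next pointer. */
        struct bootmem_data {
            unsigned long node_boot_start;
            struct bootmem_data *next;
        };

        static struct bootmem_data *bdata_list;

        /* init_bootmem() links the node in, keeping the list sorted by
         * node_boot_start so allocators walk nodes in address order. */
        static void link_bootmem(struct bootmem_data *b)
        {
            struct bootmem_data **p = &bdata_list;

            while (*p && (*p)->node_boot_start < b->node_boot_start)
                p = &(*p)->next;
            b->next = *p;
            *p = b;
        }

        int main(void)
        {
            struct bootmem_data n2 = { 0x200000, NULL };
            struct bootmem_data n0 = { 0x000000, NULL };
            struct bootmem_data n1 = { 0x100000, NULL };

            link_bootmem(&n2);
            link_bootmem(&n0);
            link_bootmem(&n1);

            for (struct bootmem_data *b = bdata_list; b; b = b->next)
                printf("node at %#lx\n", b->node_boot_start);
            return 0;
        }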
     

07 Jan, 2006

2 commits

  • The attached patch cleans up the way the bootmem allocator frees pages.

    A new function, __free_pages_bootmem(), is provided in mm/page_alloc.c
    and is called from mm/bootmem.c to turn pages over to the main
    allocator. All the bits of code that initialise pages (clearing
    PG_reserved and setting the page count) are moved here. The checks on
    page validity are removed, on the assumption that the struct page arrays
    will have been prepared correctly.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
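
    A toy model of the handover this describes: clear the reserved flag,
    give the page a refcount, and release it into the allocator. The
    types and helper are invented stand-ins, not the kernel's:

        #include <stdio.h>

        /* Minimal struct page stand-in. */
        struct page {
            unsigned int reserved : 1;
            int count;
        };

        /* Stand-in for the buddy allocator's free path. */
        static void free_to_buddy(struct page *pg)
        {
            printf("page %p entered the main allocator\n", (void *)pg);
        }

        /* All page initialisation for the handover lives in one place:
         * clear the reserved bit, give the page a reference, then drop
         * it. Validity checks are gone: the memmap is trusted. */
        static void __free_pages_bootmem(struct page *pg)
        {
            pg->reserved = 0;
            pg->count = 1;      /* like set_page_count(page, 1) */
            pg->count--;        /* release the reference        */
            if (pg->count == 0)
                free_to_buddy(pg);
        }

        int main(void)
        {
            struct page pg = { .reserved = 1, .count = 0 };
            __free_pages_bootmem(&pg);
            return 0;
        }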
     
  • This patch cleans up the alloc_bootmem fix for swiotlb. It removes the
    alloc_bootmem_*_limit APIs and fixes the alloc_bootmem_low* APIs to do
    the right thing: allocate from low 32-bit memory.

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     

13 Dec, 2005

1 commit

  • We hit the BUG_ON() in __alloc_bootmem_core() when there is no free page
    available in the first node's memory. In the case of kdump on PPC64 (a
    Power 4 machine), the capture kernel uses two memory regions: memory for
    the TCE tables (tce-base and tce-size, at the top of RAM and reserved)
    and the capture-kernel memory region itself (crashk_base and
    crashk_size). Since we reserve the memory in the first node, we should
    be returning from __alloc_bootmem_core() to search the next node
    (pgdat).

    Currently, find_next_zero_bit() returns the n-th bit (eidx) when there
    is no free page. test_bit() then fails, since we initially set 0xff only
    for the actual size (in init_bootmem_core()) even though
    bdata->node_bootmem_map is rounded up to one page. We hit the BUG_ON
    after failing to enter the second "for" loop.

    Signed-off-by: Haren Myneni
    Cc: Andy Whitcroft
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haren Myneni
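
    A sketch of the failure mode and fix, assuming find_next_zero_bit()-
    like semantics where the search returns the size of the search space
    when nothing is free; the bitmap and names are toys:

        #include <stdio.h>

        #define EIDX 16
        static unsigned char boot_map[EIDX];   /* 1 = page in use */

        /* Mimics find_next_zero_bit(): returns the size of the search
         * space when no zero bit exists, so the result MUST be
         * bounds-checked by the caller. */
        static unsigned long find_next_zero(unsigned long start)
        {
            for (unsigned long i = start; i < EIDX; i++)
                if (!boot_map[i])
                    return i;
            return EIDX;
        }

        static void *alloc_one_page(void)
        {
            unsigned long i = find_next_zero(0);

            /* The fix in spirit: a fully reserved node returns NULL so
             * the caller can fall back to the next pgdat, instead of
             * tripping a BUG_ON further down. */
            if (i >= EIDX)
                return NULL;
            boot_map[i] = 1;
            return &boot_map[i];
        }

        int main(void)
        {
            for (int k = 0; k < EIDX; k++)
                alloc_one_page();              /* exhaust the node */
            printf("exhausted node -> %p\n", alloc_one_page());
            return 0;
        }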
     

30 Oct, 2005

1 commit

  • Remove PageReserved() calls from core code by tightening VM_RESERVED
    handling in mm/ to cover PageReserved functionality.

    PageReserved special casing is removed from get_page and put_page.

    All setting and clearing of PageReserved is retained, and it is now flagged
    in the page_alloc checks to help ensure we don't introduce any refcount
    based freeing of Reserved pages.

    MAP_PRIVATE, PROT_WRITE mmaps of VM_RESERVED regions are tentatively
    being deprecated. We never handled them completely correctly anyway, and
    they can be reintroduced in the future if required (Hugh has a proof of
    concept).

    Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
    arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
    be trivially removed.

    Last real user of PageReserved is swsusp, which uses PageReserved to
    determine whether a struct page points to valid memory or not. This still
    needs to be addressed (a generic page_is_ram() should work).

    A last caveat: the ZERO_PAGE is now refcounted and managed with rmap
    (and thus mapcounted and counted towards shared rss). These writes to
    the struct page could cause excessive cacheline bouncing on big systems.
    There are a number of ways this could be addressed if it turns out to be
    an issue.

    Signed-off-by: Nick Piggin

    Refcount bug fix for filemap_xip.c

    Signed-off-by: Carsten Otte
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

20 Oct, 2005

1 commit

  • This introduces a limit parameter to the core bootmem allocator; the new
    parameter indicates that physical memory allocated by the bootmem
    allocator should be within the requested limit.

    We also introduce the alloc_bootmem_low_pages_limit,
    alloc_bootmem_node_limit and alloc_bootmem_low_pages_node_limit APIs,
    but alloc_bootmem_low_pages_limit is the only one used for swiotlb.

    The existing alloc_bootmem_low_pages() API could instead have been
    changed to pass the right limit to the core allocator. But that would
    make the patch more intrusive for 2.6.14, as other arches use
    alloc_bootmem_low_pages(). We may do that post-2.6.14 as a cleanup.

    With this, swiotlb gets memory within 4G on both the x86_64 and ia64
    arches.

    Signed-off-by: Yasunori Goto
    Cc: Ravikiran G Thirumalai
    Signed-off-by: Linus Torvalds

    Yasunori Goto
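
    A sketch of how a limit constrains the search, with a toy first-fit
    scan over pfns; the signature is invented, and only the limit check
    mirrors the idea:

        #include <stdio.h>

        #define PAGE_SHIFT 12

        /* Scan candidate pfns, rejecting any block whose end would
         * exceed the physical limit; limit == 0 means "no limit". */
        static long find_below_limit(unsigned long start_pfn,
                                     unsigned long end_pfn,
                                     unsigned long pages,
                                     unsigned long long limit)
        {
            for (unsigned long pfn = start_pfn;
                 pfn + pages <= end_pfn; pfn++) {
                unsigned long long end_addr =
                    (unsigned long long)(pfn + pages) << PAGE_SHIFT;
                if (limit && end_addr > limit)
                    break;          /* everything further is higher */
                return (long)pfn;   /* first fit (toy policy)       */
            }
            return -1;
        }

        int main(void)
        {
            unsigned long long four_gb = 1ULL << 32;

            /* A swiotlb-style request must stay under 4G: */
            printf("low node:  pfn %ld\n",
                   find_below_limit(0x00100, 0x80000, 16, four_gb));
            printf("high node: pfn %ld\n",
                   find_below_limit(0x200000, 0x280000, 16, four_gb));
            return 0;
        }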
     

01 Oct, 2005

1 commit

  • As requested by Thomas Gleixner :

    "5d3d0f7704ed0bc7eaca0501eeae3e5da1ea6c87 breaks a couple of ARM
    boards, which depend on the historical bootmem allocation order.
    There is a cleaner solution around to remove the pgdat list
    completely, but this is a topic for post 2.6.14

    Andi signalled ACK already."

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Sep, 2005

1 commit

  • This makes bootmem allocate first from node 0 instead of from the last
    node. It avoids swiotlb allocating on the last node, which doesn't
    really work on a machine with more than 4GB.

    Note: there is a better patch around from someone else that gets
    rid of the pgdat list completely.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

26 Jun, 2005

2 commits

  • This patch makes use of ALIGN() to remove duplicate round-up code.

    Signed-off-by: Nick Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Wilson
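
    For reference, the classic power-of-two ALIGN() round-up that this
    patch uses in place of hand-rolled duplicates, shown here as a
    standalone demo:

        #include <stdio.h>

        /* Round x up to a multiple of a (a must be a power of two):
         * adding a-1 and masking replaces duplicated open-coded math. */
        #define ALIGN(x, a) \
            (((x) + ((a) - 1)) & ~((unsigned long)(a) - 1))

        int main(void)
        {
            printf("ALIGN(0x1234, 0x1000) = %#lx\n",
                   ALIGN(0x1234UL, 0x1000));   /* 0x2000 */
            printf("ALIGN(0x1000, 0x1000) = %#lx\n",
                   ALIGN(0x1000UL, 0x1000));   /* unchanged */
            printf("ALIGN(1, 8)           = %lu\n",
                   ALIGN(1UL, 8));             /* 8 */
            return 0;
        }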
     
  • This patch retrieves the max_pfn used by the previous kernel and stores
    it in a safe location (saved_max_pfn) before it is overwritten by the
    user-defined memory map. This pfn is used to make sure that the user
    does not try to read physical memory beyond saved_max_pfn.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
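
    A minimal sketch of how saved_max_pfn bounds dump reads;
    read_oldmem_pfn() is an invented stand-in for the real read path:

        #include <stdio.h>

        /* max_pfn of the crashed kernel, saved before the capture
         * kernel's user-supplied memory map overwrites max_pfn. */
        static unsigned long saved_max_pfn;

        /* Refuse dump reads past the old kernel's last valid pfn. */
        static int read_oldmem_pfn(unsigned long pfn)
        {
            if (pfn >= saved_max_pfn)
                return -1;      /* beyond the crashed kernel's RAM */
            printf("would copy pfn %lu from old memory\n", pfn);
            return 0;
        }

        int main(void)
        {
            saved_max_pfn = 0x40000;   /* e.g. 1 GB of old RAM */
            read_oldmem_pfn(0x1000);
            printf("past end -> %d\n", read_oldmem_pfn(0x50000));
            return 0;
        }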
     

24 Jun, 2005

1 commit

  • Sparsemem abstracts the use of discontiguous mem_maps[]. This kind of
    mem_map[] is needed by discontiguous memory machines (like in the old
    CONFIG_DISCONTIGMEM case) as well as memory hotplug systems. Sparsemem
    replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually
    become a complete replacement.

    A significant advantage over DISCONTIGMEM is that it's completely
    separated from CONFIG_NUMA. When producing this patch, it became
    apparent that NUMA and DISCONTIG are often confused.

    Another advantage is that sparse doesn't require each NUMA node's ranges to be
    contiguous. It can handle overlapping ranges between nodes with no problems,
    where DISCONTIGMEM currently throws away that memory.

    Sparsemem uses an array to provide different pfn_to_page() translations for
    each SECTION_SIZE area of physical memory. This is what allows the mem_map[]
    to be chopped up.

    In order to do quick pfn_to_page() operations, the section number of the page
    is encoded in page->flags. Part of the sparsemem infrastructure enables
    sharing of these bits more dynamically (at compile-time) between the
    page_zone() and sparsemem operations. However, on 32-bit architectures, the
    number of bits is quite limited, and may require growing the size of the
    page->flags type in certain conditions. Several things might force this to
    occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of
    memory), an increase in the physical address space, or an increase in the
    number of used page->flags.

    One thing to note is that, once sparsemem is present, the NUMA node
    information no longer needs to be stored in the page->flags. It might provide
    speed increases on certain platforms and will be stored there if there is
    room. But, if out of room, an alternate (theoretically slower) mechanism is
    used.

    This patch introduces CONFIG_FLATMEM. It is used in almost all cases where
    there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM
    often have to compile out the same areas of code.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Dave Hansen
    Signed-off-by: Martin Bligh
    Signed-off-by: Adrian Bunk
    Signed-off-by: Yasunori Goto
    Signed-off-by: Bob Picco
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
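
    A compact model of the section-indexed pfn_to_page() translation that
    sparsemem enables; the section size and the lookup table here are
    illustrative, not the kernel's exact layout:

        #include <stdio.h>

        #define PAGE_SHIFT        12
        #define SECTION_SHIFT     27   /* 128 MB sections (demo)     */
        #define PFN_SECTION_SHIFT (SECTION_SHIFT - PAGE_SHIFT)
        #define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)
        #define NR_SECTIONS       64

        struct page { unsigned long flags; };

        /* One mem_map fragment per present section: this is what lets
         * the global mem_map[] be chopped up and allocated piecemeal. */
        static struct page *section_mem_map[NR_SECTIONS];

        static struct page *pfn_to_page(unsigned long pfn)
        {
            struct page *base = section_mem_map[pfn >> PFN_SECTION_SHIFT];
            return base ? base + (pfn & (PAGES_PER_SECTION - 1)) : NULL;
        }

        int main(void)
        {
            static struct page map_for_section3[PAGES_PER_SECTION];

            section_mem_map[3] = map_for_section3; /* section present */

            unsigned long pfn = 3 * PAGES_PER_SECTION + 42;
            printf("pfn %lu -> page %p\n", pfn, (void *)pfn_to_page(pfn));
            printf("absent section -> %p\n", (void *)pfn_to_page(7));
            return 0;
        }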
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds