26 Sep, 2006

6 commits


11 Jul, 2006

1 commit


10 Apr, 2006

1 commit

  • The node setup code would try to allocate the node metadata in the node
    itself, but that fails if there is no memory in there.

    This can happen with memory hotplug when the hotplug area defines a
    so-far empty node.

    Now use bootmem to try to allocate the mem_map in other nodes.

    And if it fails don't panic, but just ignore the node.

    To make this work I added a new __alloc_bootmem_nopanic() function that
    does what its name implies (sketched after this entry).

    TBD: we should try to use nearby nodes here; currently we just use any
    node. It's hard to do better because bootmem doesn't have proper fallback
    lists yet.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
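    A kernel-style sketch of the fallback pattern described above; only
    __alloc_bootmem_nopanic() is the interface the commit adds, while the
    helper name and surrounding details are illustrative assumptions:

        /* Hypothetical helper: try to place node metadata anywhere, and let
         * the caller skip the node instead of panicking on failure. */
        static void * __init alloc_node_data_fallback(int nodeid,
                                                      unsigned long size)
        {
            void *mem;

            /* Any node may satisfy this; bootmem has no fallback lists yet,
             * so there is no "nearest node first" policy (the TBD above). */
            mem = __alloc_bootmem_nopanic(size, PAGE_SIZE,
                                          __pa(MAX_DMA_ADDRESS));
            if (!mem)
                printk(KERN_ERR
                       "Node %d: no bootmem for node data, ignoring node\n",
                       nodeid);
            return mem;    /* NULL: the caller ignores the node, no panic */
        }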
     

28 Mar, 2006

1 commit

  • Add a list_head to bootmem_data_t and make the bootmem code use it. The
    bootmem list is sorted by node_boot_start.

    Only nodes against which init_bootmem() is called are linked to the list
    (i386, for example, allocates bootmem only from node 0, not from all
    online nodes); the sorted insertion is sketched after this entry.

    A summary:
    1. for_each_online_pgdat() traverses all *online* nodes.
    2. alloc_bootmem() allocates memory only from initialized-for-bootmem nodes.

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
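    A hedged sketch of the sorted insertion this describes (mm/bootmem.c
    context); the list-head name bdata_list, the helper name link_bootmem()
    and the new struct member name "list" are assumptions, not guaranteed
    verbatim from the patch:

        static LIST_HEAD(bdata_list);

        /* Insert a node's bootmem_data_t, keeping the list sorted by
         * node_boot_start. */
        static void __init link_bootmem(bootmem_data_t *bdata)
        {
            bootmem_data_t *ent;

            if (list_empty(&bdata_list)) {
                list_add(&bdata->list, &bdata_list);
                return;
            }
            list_for_each_entry(ent, &bdata_list, list) {
                if (bdata->node_boot_start < ent->node_boot_start) {
                    list_add_tail(&bdata->list, &ent->list);
                    return;
                }
            }
            list_add_tail(&bdata->list, &bdata_list);
        }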
     

26 Mar, 2006

1 commit


07 Jan, 2006

2 commits

  • The attached patch cleans up the way the bootmem allocator frees pages.

    A new function, __free_pages_bootmem(), is provided in mm/page_alloc.c and
    is called from mm/bootmem.c to turn pages over to the main allocator. All
    the bits of code that initialise pages (clearing PG_reserved and setting
    the page count) are moved there. The checks on page validity are removed,
    on the assumption that the struct page arrays will have been prepared
    correctly; a simplified sketch of the new function follows this entry.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
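    A simplified sketch of the handoff __free_pages_bootmem() performs; the
    real function also prefetches and special-cases order 0, which is omitted
    here:

        void __init __free_pages_bootmem(struct page *page, unsigned int order)
        {
            unsigned int i;

            /* Initialise every page in the block: it is no longer reserved
             * and starts with a zero refcount. */
            for (i = 0; i < (1U << order); i++) {
                ClearPageReserved(&page[i]);
                set_page_count(&page[i], 0);
            }

            /* The head page carries the single reference that __free_pages()
             * drops, handing the whole block to the buddy allocator. */
            set_page_count(page, 1);
            __free_pages(page, order);
        }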
     
  • This patch cleans up the alloc_bootmem fix for swiotlb. It removes the
    alloc_bootmem_*_limit APIs and fixes the alloc_bootmem_low_* APIs to do
    the right thing -- allocate from low (32-bit addressable) memory. The
    resulting shape is sketched after this entry.

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
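    A hedged sketch of the resulting shape: after the cleanup the "low"
    variants apply a 32-bit physical bound internally. LOW32LIMIT, the loop
    and the error handling are paraphrased, not the exact kernel code:

        #define LOW32LIMIT 0xffffffffUL  /* highest "low" physical address */

        void * __init __alloc_bootmem_low(unsigned long size,
                                          unsigned long align,
                                          unsigned long goal)
        {
            pg_data_t *pgdat;
            void *ptr;

            /* Walk the bootmem nodes, asking the core allocator to stay
             * below 4GB. */
            for_each_pgdat(pgdat) {
                ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal,
                                           LOW32LIMIT);
                if (ptr)
                    return ptr;
            }
            return NULL;    /* callers decide whether this is fatal */
        }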
     

13 Dec, 2005

1 commit

  • We hit the BUG_ON() in __alloc_bootmem_core() when there is no free page
    available in the first node's memory. In the kdump-on-PPC64 case (a
    POWER4 machine), the capture kernel uses two memory regions: memory for
    the TCE tables (tce-base and tce-size, at the top of RAM and reserved)
    and the captured kernel's memory region (crashk_base and crashk_size).
    Since the first node's memory is reserved, we should be returning from
    __alloc_bootmem_core() to search the next node (pgdat).

    Currently, find_next_zero_bit() returns eidx (one past the last valid
    index) when there is no free page. The subsequent test_bit() then fails,
    since we initially set 0xff only for the actual map size (in
    init_bootmem_core()) even though bdata->node_bootmem_map is rounded up to
    one page. We hit the BUG_ON after failing to enter the second "for" loop;
    the bounds check that avoids this is sketched after this entry.

    Signed-off-by: Haren Myneni
    Cc: Andy Whitcroft
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haren Myneni
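    The failure is easiest to see in the scan loop of __alloc_bootmem_core().
    Below is a simplified fragment with the kind of bounds check that avoids
    the BUG_ON; preferred, eidx, incr and the bitmap come from the surrounding
    function, and the rest of the loop body is elided:

        for (i = preferred; i < eidx; i += incr) {
            i = find_next_zero_bit(bdata->node_bootmem_map, eidx, i);
            i = ALIGN(i, incr);
            /* find_next_zero_bit() returns eidx when the map has no free
             * page left; bail out so the caller can try the next pgdat. */
            if (i >= eidx)
                break;
            if (test_bit(i, bdata->node_bootmem_map))
                continue;
            /* ... check the following 'areasize' bits and claim the range ... */
        }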
     

30 Oct, 2005

1 commit

  • Remove PageReserved() calls from core code by tightening VM_RESERVED
    handling in mm/ to cover PageReserved functionality.

    PageReserved special casing is removed from get_page and put_page.

    All setting and clearing of PageReserved is retained, and it is now flagged
    in the page_alloc checks to help ensure we don't introduce any refcount
    based freeing of Reserved pages.

    MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
    deprecated. We never completely handled it correctly anyway, and it can
    be reintroduced in the future if required (Hugh has a proof of concept).

    Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
    arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
    be trivially removed.

    Last real user of PageReserved is swsusp, which uses PageReserved to
    determine whether a struct page points to valid memory or not. This still
    needs to be addressed (a generic page_is_ram() should work).

    A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
    thus mapcounted and counted towards shared rss). These writes to the
    struct page could cause excessive cacheline bouncing on big systems. There
    are a number of ways this could be addressed if it is an issue.

    Signed-off-by: Nick Piggin

    Refcount bug fix for filemap_xip.c

    Signed-off-by: Carsten Otte
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

20 Oct, 2005

1 commit

  • This introduces a limit parameter to the core bootmem allocator; the new
    parameter indicates that physical memory allocated by the bootmem
    allocator should be within the requested limit.

    We also introduce the alloc_bootmem_low_pages_limit,
    alloc_bootmem_node_limit and alloc_bootmem_low_pages_node_limit APIs, but
    alloc_bootmem_low_pages_limit is the only one used for swiotlb (a sketch
    of the new entry point follows this entry).

    The existing alloc_bootmem_low_pages() API could instead have been changed
    to pass the right limit to the core allocator, but that would make the
    patch more intrusive for 2.6.14, as other architectures use
    alloc_bootmem_low_pages(). We may do that post-2.6.14 as a cleanup.

    With this, swiotlb gets memory within 4G for both x86_64 and ia64
    arches.

    Signed-off-by: Yasunori Goto
    Cc: Ravikiran G Thirumalai
    Signed-off-by: Linus Torvalds

    Yasunori Goto
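    A hedged paraphrase of the new limit-aware entry point; the exact
    prototypes, the macro expansion and the swiotlb call site are assumptions
    and may differ from the real patch:

        void *__alloc_bootmem_limit(unsigned long size, unsigned long align,
                                    unsigned long goal, unsigned long limit);

        #define alloc_bootmem_low_pages_limit(x, limit) \
                __alloc_bootmem_limit((x), PAGE_SIZE, 0, (limit))

        /* Illustrative swiotlb-style setup: keep the bounce buffer below
         * 4GB so it stays DMA-addressable on x86_64 and ia64. */
        static void __init swiotlb_like_init(unsigned long bytes)
        {
            void *io_tlb = alloc_bootmem_low_pages_limit(bytes, 0xffffffffUL);

            if (!io_tlb)
                panic("Cannot allocate swiotlb bounce buffer below 4GB\n");
            /* ... register io_tlb as the bounce pool ... */
        }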
     

01 Oct, 2005

1 commit

  • As requested by Thomas Gleixner:

    "5d3d0f7704ed0bc7eaca0501eeae3e5da1ea6c87 breaks a couple of ARM
    boards, which depend on the historical bootmem allocation order.
    There is a cleaner solution around to remove the pgdat list
    completely, but this is a topic for post 2.6.14.

    Andi signalled ACK already."

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Sep, 2005

1 commit

  • This leads to bootmem allocating first from node 0 instead of from the
    last node, which avoids swiotlb allocating on the last node; allocating
    there doesn't really work on a machine with more than 4GB of memory.

    Note: there is a better patch around from someone else that gets
    rid of the pgdat list completely.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

26 Jun, 2005

2 commits

  • This patch makes use of ALIGN() to remove duplicated round-up code; the
    macro is demonstrated after this entry.

    Signed-off-by: Nick Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Wilson
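    For reference, ALIGN() is a plain power-of-two round-up; this small
    standalone demo (values are illustrative) shows what the open-coded
    expressions it replaces compute:

        #include <stdio.h>

        /* Same shape as the kernel's ALIGN() in include/linux/kernel.h:
         * round x up to the next multiple of the power-of-two a. */
        #define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

        int main(void)
        {
            unsigned long page_size = 4096;

            printf("%lu\n", ALIGN(5000UL, page_size));  /* 8192 */
            printf("%lu\n", ALIGN(4096UL, page_size));  /* 4096, already aligned */
            return 0;
        }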
     
  • This patch retrieves the max_pfn being used by the previous kernel and
    stores it in a safe location (saved_max_pfn) before it is overwritten by
    the user-defined memory map. This pfn is used to make sure that the user
    does not try to read physical memory beyond saved_max_pfn.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     

24 Jun, 2005

1 commit

  • Sparsemem abstracts the use of discontiguous mem_maps[]. This kind of
    mem_map[] is needed by discontiguous memory machines (like in the old
    CONFIG_DISCONTIGMEM case) as well as memory hotplug systems. Sparsemem
    replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually
    become a complete replacement.

    A significant advantage over DISCONTIGMEM is that it's completely
    separated from CONFIG_NUMA. When producing this patch, it became apparent
    that NUMA and DISCONTIG are often confused.

    Another advantage is that sparsemem doesn't require each NUMA node's
    ranges to be contiguous. It can handle overlapping ranges between nodes
    with no problems, where DISCONTIGMEM currently throws that memory away.

    Sparsemem uses an array to provide different pfn_to_page() translations
    for each SECTION_SIZE area of physical memory. This is what allows the
    mem_map[] to be chopped up (a toy model of the lookup follows this
    entry).

    In order to do quick pfn_to_page() operations, the section number of the page
    is encoded in page->flags. Part of the sparsemem infrastructure enables
    sharing of these bits more dynamically (at compile-time) between the
    page_zone() and sparsemem operations. However, on 32-bit architectures, the
    number of bits is quite limited, and may require growing the size of the
    page->flags type in certain conditions. Several things might force this to
    occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of
    memory), an increase in the physical address space, or an increase in the
    number of used page->flags.

    One thing to note is that, once sparsemem is present, the NUMA node
    information no longer needs to be stored in the page->flags. Keeping it
    there might provide speed increases on certain platforms, so it will be
    stored there if there is room; if not, an alternate (theoretically
    slower) mechanism is used.

    This patch introduces CONFIG_FLATMEM. It is used in almost all cases where
    there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM
    often have to compile out the same areas of code.

    Signed-off-by: Andy Whitcroft
    Signed-off-by: Dave Hansen
    Signed-off-by: Martin Bligh
    Signed-off-by: Adrian Bunk
    Signed-off-by: Yasunori Goto
    Signed-off-by: Bob Picco
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
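    A toy, standalone model of the lookup described above: a small table of
    per-section mem_map pointers gives pfn_to_page() without one contiguous
    mem_map[]. All names and sizes here are illustrative, not the kernel's
    (the kernel additionally caches the section number in page->flags for the
    reverse lookup):

        #include <stdio.h>
        #include <stdlib.h>

        #define SECTION_SHIFT     12                   /* pages per section */
        #define PAGES_PER_SECTION (1UL << SECTION_SHIFT)
        #define NR_SECTIONS       64

        struct page { unsigned long flags; };

        /* One mem_map per populated section; NULL marks a hole. */
        static struct page *section_mem_map[NR_SECTIONS];

        static struct page *pfn_to_page(unsigned long pfn)
        {
            struct page *map = section_mem_map[pfn >> SECTION_SHIFT];

            return map ? &map[pfn & (PAGES_PER_SECTION - 1)] : NULL;
        }

        int main(void)
        {
            /* Populate only section 3: a discontiguous chunk of "memory". */
            section_mem_map[3] = calloc(PAGES_PER_SECTION, sizeof(struct page));

            printf("pfn in a hole -> %p\n", (void *)pfn_to_page(100));
            printf("pfn in sec 3  -> %p\n",
                   (void *)pfn_to_page(3 * PAGES_PER_SECTION + 7));
            return 0;
        }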
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds