02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
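
    For illustration, the identifier is a single comment on the first line of
    a file, in whatever comment style the file type uses, e.g.:

    // SPDX-License-Identifier: GPL-2.0
    /* SPDX-License-Identifier: GPL-2.0 */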

    This patch is based on work done by Thomas Gleixner, Kate Stewart, and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and where references to a
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files, created by Philippe Ombredanne. Philippe prepared the
    base worksheet and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).

    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

07 Sep, 2017

1 commit

  • Commit f52407ce2dea ("memory hotplug: alloc page from other node in
    memory online") has introduced N_HIGH_MEMORY checks to only use NUMA
    aware allocations when there is some memory present because the
    respective node might not have any memory yet at the time and so it
    could fail or even OOM.

    Things have changed since then though. Zonelists are now always
    initialized before we do any allocations even for hotplug (see
    959ecc48fc75 ("mm/memory_hotplug.c: fix building of node hotplug
    zonelist")).

    Therefore these checks are not really needed. In fact, callers of the
    allocator should never care whether the node is populated, because that
    might change at any time.
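
    A minimal sketch of the kind of check being dropped (hypothetical call
    site, not the exact diff):

    /* before: only use a NUMA-aware allocation if the node has memory */
    if (node_state(nid, N_HIGH_MEMORY))
            page = alloc_pages_node(nid, GFP_KERNEL, order);
    else
            page = alloc_pages(GFP_KERNEL, order);

    /* after: the caller does not care whether the node is populated */
    page = alloc_pages_node(nid, GFP_KERNEL, order);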

    Link: http://lkml.kernel.org/r/20170721143915.14161-10-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Shaohua Li
    Cc: Joonsoo Kim
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Toshi Kani
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

13 Jul, 2017

1 commit

  • __GFP_REPEAT was designed to allow a retry-but-eventually-fail semantic in
    the page allocator. This has been true, but only for allocation requests
    larger than PAGE_ALLOC_COSTLY_ORDER; it has always been ignored for
    smaller sizes. This is unfortunate because there is no way to express the
    same semantic for those requests: they are considered too important to
    fail, so they might end up looping in the page allocator forever,
    similarly to __GFP_NOFAIL requests.

    Now that the whole tree has been cleaned up and accidental or misled usage
    of the __GFP_REPEAT flag has been removed for !costly requests, we can
    give the original flag a better name and, more importantly, a more useful
    semantic. Let's rename it to __GFP_RETRY_MAYFAIL, which tells the user
    that the allocator will try really hard but there is no promise of
    success. This works independently of the order and overrides the default
    allocator behavior. Page allocator users have several levels of guarantee
    vs. cost options (take GFP_KERNEL as an example; a short usage sketch
    follows the list):

    - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
    attempt to free memory at all. The most lightweight mode, which does not
    even kick the background reclaim. Should be used carefully because it
    might deplete the memory and the next user might hit the more aggressive
    reclaim.

    - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT) - optimistic
    allocation without any attempt to free memory from the current
    context but can wake kswapd to reclaim memory if the zone is below
    the low watermark. Can be used from either atomic contexts or when
    the request is a performance optimization and there is another
    fallback for a slow path.

    - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
    non-sleeping allocation with an expensive fallback so it can access
    some portion of memory reserves. Usually used from interrupt/bh
    context with an expensive slow path fallback.

    - GFP_KERNEL - both background and direct reclaim are allowed and the
    _default_ page allocator behavior is used. That means that !costly
    allocation requests are basically nofail but there is no guarantee of
    that behavior so failures have to be checked properly by callers
    (e.g. OOM killer victim is allowed to fail currently).

    - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
    and all allocation requests fail early rather than cause disruptive
    reclaim (one round of reclaim in this implementation). The OOM killer
    is not invoked.

    - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
    behavior and all allocation requests try really hard. The request
    will fail if the reclaim cannot make any progress. The OOM killer
    won't be triggered.

    - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
    and all allocation requests will loop endlessly until they succeed.
    This might be really dangerous especially for larger orders.
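
    As a usage sketch (the call site below is hypothetical), a caller that
    wants the try-hard-but-may-fail behavior with its own fallback:

    ptr = kmalloc(size, GFP_KERNEL | __GFP_RETRY_MAYFAIL);
    if (!ptr)
            return -ENOMEM;    /* caller-defined fallback, no OOM killer */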

    Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL because
    that is the semantic they already relied on. No new users are added.
    __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
    there is no progress and we have already passed the OOM point.

    This means that all the reclaim opportunities have been exhausted except
    the most disruptive one (the OOM killer), and a user-defined fallback
    behavior is more sensible than keeping on retrying in the page allocator.

    [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
    [mhocko@suse.com: semantic fix]
    Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
    [mhocko@kernel.org: address other thing spotted by Vlastimil]
    Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
    Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Alex Belits
    Cc: Chris Wilson
    Cc: Christoph Hellwig
    Cc: Darrick J. Wong
    Cc: David Daney
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: NeilBrown
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

03 Aug, 2016

1 commit

  • There was only one use of __initdata_refok and __exit_refok.

    __init_refok was used 46 times against 82 for __ref.

    Those definitions are obsolete since commit 312b1485fb50 ("Introduce new
    section reference annotations tags: __ref, __refdata, __refconst").

    This patch removes the following compatibility definitions and replaces
    them treewide.

    /* compatibility defines */
    #define __init_refok __ref
    #define __initdata_refok __refdata
    #define __exit_refok __ref
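
    As an illustration (the function name here is hypothetical), the treewide
    replacement is a simple annotation swap:

    -static int __init_refok setup_foo(void)
    +static int __ref setup_foo(void)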

    I can also provide separate patches if necessary (one patch per tree, then
    a check in a month or two to remove the old definitions).

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.be
    Signed-off-by: Fabian Frederick
    Cc: Ingo Molnar
    Cc: Sam Ravnborg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

18 Mar, 2016

2 commits

  • Most of the mm subsystem uses pr_<level> so make it consistent.

    Miscellanea:

    - Realign arguments
    - Add missing newline to format
    - kmemleak-test.c has a "kmemleak: " prefix added to the
    "Kmemleak testing" logging message via pr_fmt

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Kernel style prefers a single string over split strings when the string is
    'user-visible'.

    Miscellanea:

    - Add a missing newline
    - Realign arguments
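
    For example (hypothetical call site), a split user-visible string is
    joined into a single, greppable string:

    -pr_info("this is a "
    -        "user-visible message\n");
    +pr_info("this is a user-visible message\n");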

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

16 Jan, 2016

1 commit

  • In support of providing struct page for large persistent memory
    capacities, use struct vmem_altmap to change the default policy for
    allocating memory for the memmap array. The default vmemmap_populate()
    allocates page table storage area from the page allocator. Given
    persistent memory capacities relative to DRAM it may not be feasible to
    store the memmap in 'System Memory'. Instead vmem_altmap represents
    pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
    requests.
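
    Conceptually, the altmap describes a pre-allocated range of device pages
    from which the memmap can be carved out; an abridged sketch of the
    descriptor (simplified and from memory, not the authoritative definition):

    struct vmem_altmap {
            const unsigned long base_pfn;  /* first pfn of the device memory */
            const unsigned long reserve;   /* pages left untouched at the start */
            unsigned long free;            /* device pages available for the memmap */
            unsigned long align;
            unsigned long alloc;           /* device pages already handed out */
    };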

    Signed-off-by: Dan Williams
    Reported-by: kbuild test robot
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

22 Jan, 2014

1 commit

  • Switch to memblock interfaces for the early memory allocator instead of
    the bootmem allocator. No functional change in behavior from the bootmem
    users' point of view.

    Archs already converted to NO_BOOTMEM now directly use memblock interfaces
    instead of bootmem wrappers built on top of memblock. For the archs which
    still use bootmem, these new APIs just fall back to the existing bootmem
    APIs.
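
    For illustration (the call site is hypothetical), an early allocation goes
    through the new memblock wrapper instead of the bootmem API:

    -       ptr = alloc_bootmem(size);
    +       ptr = memblock_virt_alloc(size, 0);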

    Signed-off-by: Santosh Shilimkar
    Cc: "Rafael J. Wysocki"
    Cc: Arnd Bergmann
    Cc: Christoph Lameter
    Cc: Greg Kroah-Hartman
    Cc: Grygorii Strashko
    Cc: H. Peter Anvin
    Cc: Johannes Weiner
    Cc: KAMEZAWA Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Michal Hocko
    Cc: Paul Walmsley
    Cc: Pavel Machek
    Cc: Russell King
    Cc: Tejun Heo
    Cc: Tony Lindgren
    Cc: Yinghai Lu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Santosh Shilimkar
     

30 Apr, 2013

2 commits

  • The sparse code, when asking the architecture to populate the vmemmap,
    specifies the section range as a starting page and a number of pages.

    This is an awkward interface, because none of the arch-specific code
    actually thinks of the range in terms of 'struct page' units and always
    translates it to bytes first.

    In addition, later patches mix huge page and regular page backing for
    the vmemmap. For this, they need to call vmemmap_populate_basepages()
    on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these
    are not necessarily multiples of the 'struct page' size and so this unit
    is too coarse.

    Just translate the section range into bytes once in the generic sparse
    code, then pass byte ranges down the stack.
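
    The resulting interface change is roughly (signatures abbreviated,
    parameter names approximate):

    /* before: section range in 'struct page' units */
    int vmemmap_populate(struct page *start_page, unsigned long nr_pages, int node);

    /* after: plain virtual address (byte) range */
    int vmemmap_populate(unsigned long start, unsigned long end, int node);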

    Signed-off-by: Johannes Weiner
    Cc: Ben Hutchings
    Cc: Bernhard Schmidt
    Cc: Johannes Weiner
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Acked-by: David S. Miller
    Tested-by: David S. Miller
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Hot-adding memory on x86_64 normally requires huge page allocation.
    When this is done to a VM guest, it's usually because the system is
    already tight on memory, so the request tends to fail. Try to avoid
    this by adding __GFP_REPEAT to the allocation flags.

    Addresses http://bugs.debian.org/699913

    Signed-off-by: Ben Hutchings
    Signed-off-by: Johannes Weiner
    Reported-by: Bernhard Schmidt
    Tested-by: Bernhard Schmidt
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Hutchings
     

02 Nov, 2010

1 commit

  • "gadget", "through", "command", "maintain", "maintain", "controller", "address",
    "between", "initiali[zs]e", "instead", "function", "select", "already",
    "equal", "access", "management", "hierarchy", "registration", "interest",
    "relative", "memory", "offset", "already",

    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Jiri Kosina

    Uwe Kleine-König
     

28 Aug, 2010

1 commit

  • 1. Replace find_e820_area with memblock_find_in_range.
    2. Replace reserve_early with memblock_x86_reserve_range.
    3. Replace free_early with memblock_x86_free_range.
    4. NO_BOOTMEM will switch to use memblock too.
    5. Use the _e820/_early wrappers in this patch; a following patch will
    replace them all.
    6. Because memblock_x86_free_range supports partial free, we can remove
    some special-case handling.
    7. Make sure that memblock_find_in_range() is called after
    memblock_x86_fill(), so adjust some calls later in setup.c::setup_arch()
    -- corruption_check and mptable_update.
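
    For illustration (the call site is hypothetical), the first of these
    replacements looks like:

    -       addr = find_e820_area(start, end, size, align);
    +       addr = memblock_find_in_range(start, end, size, align);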

    -v2: Move reserve_brk() early, before fill_memblock_area(), to avoid an
    overlap between brk and memblock_find_in_range(). That could happen when
    we have more than 128 RAM entries in the E820 tables and
    memblock_x86_fill() uses memblock_find_in_range() to find a new place for
    the memblock.memory.region array. We also don't need to use extend_brk()
    after fill_memblock_area(), so move reserve_brk() before it.
    -v3: Move find_smp_config early, to make sure memblock_find_in_range does
    not find the wrong place if the BIOS doesn't put the mptable in the right
    place.
    -v4: Treat RESERVED_KERN as RAM in memblock.memory; those ranges are
    already in memblock.reserved. Use __NOT_KEEP_MEMBLOCK to make sure
    memblock-related code can be freed later.
    -v5: The generic __memblock_find_in_range() goes from high to low, and
    active_region for 32bit does include high pages, so replace the limit
    with memblock.default_alloc_limit, aka get_max_mapped().
    -v6: Use current_limit instead.
    -v7: Check against MEMBLOCK_ERROR instead of -1ULL or -1L.
    -v8: Set memblock_can_resize early to handle EFI with more RAM entries.
    -v9: Update after kmemleak changes in mainline.

    Suggested-by: David S. Miller
    Suggested-by: Benjamin Herrenschmidt
    Suggested-by: Thomas Gleixner
    Signed-off-by: Yinghai Lu
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
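
    As a concrete illustration (hypothetical file), a .c file that calls
    kmalloc()/kfree() but previously relied on the indirect
    percpu.h -> slab.h chain simply gains an explicit include:

    #include <linux/slab.h>    /* for kmalloc()/kfree() */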

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

13 Feb, 2010

2 commits

  • Add vmemmap_alloc_block_buf for mem map only.

    It will fall back to the old way if it cannot get a block that big.

    Before this patch, when a node has 128g of RAM installed, the memmap is
    split into two or more parts:
    [ 0.000000] [ffffea0000000000-ffffea003fffffff] PMD -> [ffff880100600000-ffff88013e9fffff] on node 1
    [ 0.000000] [ffffea0040000000-ffffea006fffffff] PMD -> [ffff88013ec00000-ffff88016ebfffff] on node 1
    [ 0.000000] [ffffea0070000000-ffffea007fffffff] PMD -> [ffff882000600000-ffff8820105fffff] on node 0
    [ 0.000000] [ffffea0080000000-ffffea00bfffffff] PMD -> [ffff882010800000-ffff8820507fffff] on node 0
    [ 0.000000] [ffffea00c0000000-ffffea00dfffffff] PMD -> [ffff882050a00000-ffff8820709fffff] on node 0
    [ 0.000000] [ffffea00e0000000-ffffea00ffffffff] PMD -> [ffff884000600000-ffff8840205fffff] on node 2
    [ 0.000000] [ffffea0100000000-ffffea013fffffff] PMD -> [ffff884020800000-ffff8840607fffff] on node 2
    [ 0.000000] [ffffea0140000000-ffffea014fffffff] PMD -> [ffff884060a00000-ffff8840709fffff] on node 2
    [ 0.000000] [ffffea0150000000-ffffea017fffffff] PMD -> [ffff886000600000-ffff8860305fffff] on node 3
    [ 0.000000] [ffffea0180000000-ffffea01bfffffff] PMD -> [ffff886030800000-ffff8860707fffff] on node 3
    [ 0.000000] [ffffea01c0000000-ffffea01ffffffff] PMD -> [ffff888000600000-ffff8880405fffff] on node 4
    [ 0.000000] [ffffea0200000000-ffffea022fffffff] PMD -> [ffff888040800000-ffff8880707fffff] on node 4
    [ 0.000000] [ffffea0230000000-ffffea023fffffff] PMD -> [ffff88a000600000-ffff88a0105fffff] on node 5
    [ 0.000000] [ffffea0240000000-ffffea027fffffff] PMD -> [ffff88a010800000-ffff88a0507fffff] on node 5
    [ 0.000000] [ffffea0280000000-ffffea029fffffff] PMD -> [ffff88a050a00000-ffff88a0709fffff] on node 5
    [ 0.000000] [ffffea02a0000000-ffffea02bfffffff] PMD -> [ffff88c000600000-ffff88c0205fffff] on node 6
    [ 0.000000] [ffffea02c0000000-ffffea02ffffffff] PMD -> [ffff88c020800000-ffff88c0607fffff] on node 6
    [ 0.000000] [ffffea0300000000-ffffea030fffffff] PMD -> [ffff88c060a00000-ffff88c0709fffff] on node 6
    [ 0.000000] [ffffea0310000000-ffffea033fffffff] PMD -> [ffff88e000600000-ffff88e0305fffff] on node 7
    [ 0.000000] [ffffea0340000000-ffffea037fffffff] PMD -> [ffff88e030800000-ffff88e0707fffff] on node 7

    After the patch we get:
    [ 0.000000] [ffffea0000000000-ffffea006fffffff] PMD -> [ffff880100200000-ffff88016e5fffff] on node 0
    [ 0.000000] [ffffea0070000000-ffffea00dfffffff] PMD -> [ffff882000200000-ffff8820701fffff] on node 1
    [ 0.000000] [ffffea00e0000000-ffffea014fffffff] PMD -> [ffff884000200000-ffff8840701fffff] on node 2
    [ 0.000000] [ffffea0150000000-ffffea01bfffffff] PMD -> [ffff886000200000-ffff8860701fffff] on node 3
    [ 0.000000] [ffffea01c0000000-ffffea022fffffff] PMD -> [ffff888000200000-ffff8880701fffff] on node 4
    [ 0.000000] [ffffea0230000000-ffffea029fffffff] PMD -> [ffff88a000200000-ffff88a0701fffff] on node 5
    [ 0.000000] [ffffea02a0000000-ffffea030fffffff] PMD -> [ffff88c000200000-ffff88c0701fffff] on node 6
    [ 0.000000] [ffffea0310000000-ffffea037fffffff] PMD -> [ffff88e000200000-ffff88e0701fffff] on node 7

    -v2: change buf to vmemmap_buf according to Ingo;
    also add CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER according to Ingo
    -v3: according to Andrew, use sizeof(name) instead of a hard-coded 15

    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Cc: Christoph Lameter
    Acked-by: Christoph Lameter
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     
  • Finally we can use early_res to replace bootmem for x86_64 now.

    CONFIG_NO_BOOTMEM can still be used to enable or disable it.

    -v2: fix 32bit compiling about MAX_DMA32_PFN
    -v3: folded bug fix from LKML message below

    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     

22 Sep, 2009

1 commit

  • To initialize a hotadded node, some pages are allocated. At that time the
    node has no memory yet, which makes the allocation always fail. In such a
    case, let's allocate pages from other nodes.

    Signed-off-by: Shaohua Li
    Signed-off-by: Yakui Zhao
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

30 Oct, 2007

1 commit

  • mm/sparse-vmemmap.c uses init_mm in some places. However, it is not
    present in any of the headers currently included in the file.

    init_mm is defined as extern in sched.h, so we add it to the headers list.

    Up to now, this problem was masked by the fact that functions like
    set_pte_at() and pmd_populate_kernel() are usually macros that expand to
    simpler variants that do not use the first parameter at all.
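
    The fix is simply to pull in the header that declares init_mm:

    #include <linux/sched.h>    /* for init_mm */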

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber de Oliveira Costa
     

17 Oct, 2007

3 commits

  • This patch avoids a panic when memory hot-add is executed with
    sparsemem-vmemmap. The current vmemmap-sparsemem code doesn't support
    memory hot-add; the vmemmap must be populated on hot-add. This is for
    2.6.23-rc2-mm2.

    Todo: # Even if this patch is applied, the message "[xxxx-xxxx] potential
    offnode page_structs" is still displayed. To allocate the memmap on its
    own node, the memmap (and pgdat) must first be initialized itself, a
    chicken-and-egg relationship.

    # vmemmap_unpopulate will be necessary for the following:
    - cancelling a hot-add due to an error,
    - unplug.

    Signed-off-by: Yasunori Goto
    Cc: Andy Whitcroft
    Cc: Christoph Lameter
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • Convert the common vmemmap population into initialisation helpers for use by
    architecture vmemmap populators. All architectures implementing the
    SPARSEMEM_VMEMMAP variant supply an architecture-specific
    vmemmap_populate() initialiser, which may make use of the helpers.

    This allows us to clean up and remove the initialisation Kconfig entries.
    With this patch there is a single SPARSEMEM_VMEMMAP_ENABLE Kconfig option to
    indicate use of that variant.

    Signed-off-by: Andy Whitcroft
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • SPARSEMEM is a pretty nice framework that unifies quite a bit of code over all
    the arches. It would be great if it could be the default so that we can get
    rid of various forms of DISCONTIG and other variations on memory maps. So far
    what has hindered this are the additional lookups that SPARSEMEM introduces
    for virt_to_page and page_address. This goes so far that the code to do this
    has to be kept in a separate function and cannot be used inline.

    This patch introduces a virtual memmap mode for SPARSEMEM, in which the
    memmap is mapped into a virtually contiguous area; only the active
    sections are physically backed. This allows virt_to_page, page_address
    and cohorts to become simple shift/add operations. No page flag fields,
    no table lookups, nothing involving memory is required.

    The two key operations pfn_to_page and page_to_pfn become:

    #define __pfn_to_page(pfn) (vmemmap + (pfn))
    #define __page_to_pfn(page) ((page) - vmemmap)

    By having a virtual mapping for the memmap we allow simple access without
    wasting physical memory. As kernel memory is typically already mapped 1:1
    this introduces no additional overhead. The virtual mapping must be big
    enough to allow a struct page to be allocated and mapped for all valid
    physical pages. This will make a virtual memmap difficult to use on 32 bit
    platforms that support 36 address bits.

    However, if there is enough virtual space available and the arch already maps
    its 1-1 kernel space using TLBs (f.e. true of IA64 and x86_64) then this
    technique makes SPARSEMEM lookups even more efficient than CONFIG_FLATMEM.
    FLATMEM needs to read the contents of the mem_map variable to get the start of
    the memmap and then add the offset to the required entry. vmemmap is a
    constant to which we can simply add the offset.

    This patch has the potential to allow us to make SPARSEMEM the default (and
    even the only) option for most systems. It should be optimal on UP, SMP and
    NUMA on most platforms. Then we may even be able to remove the other memory
    models: FLATMEM, DISCONTIG etc.

    [apw@shadowen.org: config cleanups, resplit code etc]
    [kamezawa.hiroyu@jp.fujitsu.com: Fix sparsemem_vmemmap init]
    [apw@shadowen.org: vmemmap: remove excess debugging]
    [apw@shadowen.org: simplify initialisation code and reduce duplication]
    [apw@shadowen.org: pull out the vmemmap code into its own file]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter