12 Jan, 2012

1 commit

  • * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/numa: Add constraints check for nid parameters
    mm, x86: Remove debug_pagealloc_enabled
    x86/mm: Initialize high mem before free_all_bootmem()
    arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer
    arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map()
    x86: Fix mmap random address range
    x86, mm: Unify zone_sizes_init()
    x86, mm: Prepare zone_sizes_init() for unification
    x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit
    x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32
    x86, mm: Use max_pfn instead of highend_pfn
    x86, mm: Move zone init from paging_init() on 64-bit
    x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit

    Linus Torvalds
     

29 Nov, 2011

1 commit

  • Conflicts & resolutions:

    * arch/x86/xen/setup.c

    dc91c728fd "xen: allow extra memory to be in multiple regions"
    24aa07882b "memblock, x86: Replace memblock_x86_reserve/free..."

    conflicted on xen_add_extra_mem() updates. The resolution is
    trivial as the latter just wants to replace
    memblock_x86_reserve_range() with memblock_reserve().

    * drivers/pci/intel-iommu.c

    166e9278a3f "x86/ia64: intel-iommu: move to drivers/iommu/"
    5dfe8660a3d "bootmem: Replace work_with_active_regions() with..."

    conflicted as the former moved the file under drivers/iommu/.
    Resolved by applying the changes from the latter to the moved
    file.

    * mm/Kconfig

    6661672053a "memblock: add NO_BOOTMEM config symbol"
    c378ddd53f9 "memblock, x86: Make ARCH_DISCARD_MEMBLOCK a config option"

    conflicted trivially. Both added config options. Just
    letting both add their own options resolves the conflict.

    * mm/memblock.c

    d1f0ece6cdc "mm/memblock.c: small function definition fixes"
    ed7b56a799c "memblock: Remove memblock_memory_can_coalesce()"

    conflicted. The former updates a function removed by the
    latter. Resolution is trivial.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

11 Nov, 2011

1 commit

  • Now that zone_sizes_init() is identical on 32-bit and 64-bit,
    move the code to arch/x86/mm/init.c and use it for both
    architectures.

    Acked-by: Tejun Heo
    Acked-by: Yinghai Lu
    Signed-off-by: Pekka Enberg
    Link: http://lkml.kernel.org/r/1320155902-10424-7-git-send-email-penberg@kernel.org
    Signed-off-by: Ingo Molnar

    Pekka Enberg
     

24 Oct, 2011

1 commit

  • Commit 4b239f458 ("x86-64, mm: Put early page table high") causes an S4
    regression since 2.6.39, namely the machine occasionally reboots at S4
    resume. It doesn't happen every time (the overall rate is about 1/20),
    but like other bugs, once it starts happening it keeps happening.

    This patch fixes the problem by essentially reverting the memory
    assignment in the older way.

    Signed-off-by: Takashi Iwai
    Cc:
    Cc: Rafael J. Wysocki
    Cc: Yinghai Lu
    [ We'll hopefully find the real fix, but that's too late for 3.1 now ]
    Signed-off-by: Linus Torvalds

    Takashi Iwai
     

15 Jul, 2011

1 commit

  • Other than sanity checks and debug messages, the x86-specific versions
    of the memblock reserve/free functions are simple wrappers around the
    generic versions, memblock_reserve()/memblock_free().

    This patch adds debug messages with caller identification to the
    generic versions, replaces the x86-specific ones with them, and kills
    the x86-specific versions. arch/x86/include/asm/memblock.h and
    arch/x86/mm/memblock.c become empty after this change and are removed.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/1310462166-31469-14-git-send-email-tj@kernel.org
    Cc: Yinghai Lu
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: H. Peter Anvin

    Tejun Heo
     

14 Jul, 2011

1 commit

  • 25818f0f28 (memblock: Make MEMBLOCK_ERROR be 0) thankfully made
    MEMBLOCK_ERROR 0, and there is already code which expects an error
    return of 0. There's no point in keeping MEMBLOCK_ERROR around. End
    its misery.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/1310457490-3356-6-git-send-email-tj@kernel.org
    Cc: Yinghai Lu
    Cc: Benjamin Herrenschmidt
    Signed-off-by: H. Peter Anvin

    Tejun Heo
     

25 May, 2011

1 commit

  • Fold all the mmu_gather rework patches into one for submission

    Signed-off-by: Peter Zijlstra
    Reported-by: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

13 May, 2011

2 commits

  • With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415:

    LD vmlinux.o
    MODPOST vmlinux.o
    WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range()
    The function native_pagetable_reserve() references
    the function __init memblock_x86_reserve_range().
    This is often because native_pagetable_reserve lacks a __init
    annotation or the annotation of memblock_x86_reserve_range is wrong.

    This patch fixes the issue.
    Thanks to pipacs from PaX project for help on IRC.

    Acked-by: "H. Peter Anvin"
    Signed-off-by: Sedat Dilek
    Signed-off-by: Konrad Rzeszutek Wilk

    Sedat Dilek
     
  • Introduce a new x86_init hook called pagetable_reserve that at the end
    of init_memory_mapping is used to reserve a range of memory addresses for
    the kernel pagetable pages we used and free the other ones.

    On native it just calls memblock_x86_reserve_range, while on Xen it
    also takes care of setting the spare memory previously allocated for
    kernel pagetable pages from RO to RW, so that it can be used for
    other purposes.

    A detailed explanation of the reason why this hook is needed follows.

    As a consequence of the commit:

    commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e
    Author: Yinghai Lu
    Date: Fri Dec 17 16:58:28 2010 -0800

    x86-64, mm: Put early page table high

    at some point init_memory_mapping is going to reach the pagetable pages
    area and map those pages too (mapping them as normal memory that falls
    in the range of addresses passed to init_memory_mapping as argument).
    Some of those pages are already pagetable pages (they are in the range
    pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
    everything is fine.
    Some of these pages are not pagetable pages yet (they fall in the range
    pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
    are going to be mapped RW. When these pages become pagetable pages and
    are hooked into the pagetable, Xen will find that the guest already has
    an RW mapping of them somewhere and will fail the operation.
    The reason Xen requires pagetables to be RO is that the hypervisor needs
    to verify that the pagetables are valid before using them. The validation
    operations are called "pinning" (more details in arch/x86/xen/mmu.c).

    In order to fix the issue we mark all the pages in the entire range
    pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
    is completed only the range pgt_buf_start-pgt_buf_end is reserved by
    init_memory_mapping. Hence the kernel is going to crash as soon as one
    of the pages in the range pgt_buf_end-pgt_buf_top is reused, because
    those pages are RO.

    For this reason we need a hook to reserve the kernel pagetable pages we
    used and free the other ones so that they can be reused for other
    purposes.
    On native it just means calling memblock_x86_reserve_range; on Xen it
    also means marking RW the pagetable pages that we allocated earlier
    but never ended up using.

    Another way to fix this without using the hook would be to add an 'if
    (xen_pv_domain())' check to the init_memory_mapping code and call the
    Xen counterpart there, but that is just nasty.

    Signed-off-by: Stefano Stabellini
    Acked-by: Yinghai Lu
    Acked-by: H. Peter Anvin
    Cc: Ingo Molnar
    Signed-off-by: Konrad Rzeszutek Wilk

    Stefano Stabellini
     

24 Feb, 2011

1 commit

  • e820_table_{start|end|top}, which are used to buffer page table
    allocation during early boot, are now derived from memblock and don't
    have much to do with e820. Change the names so that they reflect what
    they're used for.

    This patch doesn't introduce any behavior change.

    -v2: Ingo found that earlier patch "x86: Use early pre-allocated page
    table buffer top-down" caused crash on 32bit and needed to be
    dropped. This patch was updated to reflect the change.

    -tj: Updated commit description.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Tejun Heo

    Yinghai Lu
     

14 Feb, 2011

1 commit


05 Jan, 2011

1 commit


30 Dec, 2010

2 commits

  • Introduce init_memory_mapping_high() and use it on 64-bit.

    For every memory segment above 4G, it creates the page tables within
    the memory range itself.

    Before this patch, all page tables were on one node.

    With this patch, one RED-PEN is resolved.

    Debug output for an 8-socket system after the patch:
    [ 0.000000] initial memory mapped : 0 - 20000000
    [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
    [ 0.000000] 0000000000 - 007f600000 page 2M
    [ 0.000000] 007f600000 - 007f750000 page 4k
    [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff]
    [ 0.000000] RAMDISK: 7bc84000 - 7f745000
    ....
    [ 0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used
    [ 0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used
    [ 0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used
    [ 0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used
    [ 0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used
    [ 0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used
    [ 0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used
    [ 0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used
    [ 0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used
    [ 0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used
    [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff]
    [ 0.000000] 0100000000 - 1080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff]
    [ 0.000000] 1080000000 - 2080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x207ffc0000-0x207fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00002080000000-0x0000307fffffff]
    [ 0.000000] 2080000000 - 3080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 3080000000 @ [0x307ff3d000-0x307fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x307ffc0000-0x307fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00003080000000-0x0000407fffffff]
    [ 0.000000] 3080000000 - 4080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 4080000000 @ [0x407fefd000-0x407fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x407ffc0000-0x407fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00004080000000-0x0000507fffffff]
    [ 0.000000] 4080000000 - 5080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 5080000000 @ [0x507febd000-0x507fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x507ffc0000-0x507fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00005080000000-0x0000607fffffff]
    [ 0.000000] 5080000000 - 6080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 6080000000 @ [0x607fe7d000-0x607fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x607ffc0000-0x607fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00006080000000-0x0000707fffffff]
    [ 0.000000] 6080000000 - 7080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 7080000000 @ [0x707fe3d000-0x707fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x707ffc0000-0x707fffffff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00007080000000-0x0000807fffffff]
    [ 0.000000] 7080000000 - 8080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 8080000000 @ [0x807fdfc000-0x807fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x807ffbf000-0x807fffffff] PGTABLE
    [ 0.000000] Initmem setup node 0 [0000000000000000-000000107fffffff]
    [ 0.000000] NODE_DATA [0x0000107ffbd000-0x0000107ffc1fff]
    [ 0.000000] Initmem setup node 1 [0000001080000000-000000207fffffff]
    [ 0.000000] NODE_DATA [0x0000207ffbb000-0x0000207ffbffff]
    [ 0.000000] Initmem setup node 2 [0000002080000000-000000307fffffff]
    [ 0.000000] NODE_DATA [0x0000307ffbb000-0x0000307ffbffff]
    [ 0.000000] Initmem setup node 3 [0000003080000000-000000407fffffff]
    [ 0.000000] NODE_DATA [0x0000407ffbb000-0x0000407ffbffff]
    [ 0.000000] Initmem setup node 4 [0000004080000000-000000507fffffff]
    [ 0.000000] NODE_DATA [0x0000507ffbb000-0x0000507ffbffff]
    [ 0.000000] Initmem setup node 5 [0000005080000000-000000607fffffff]
    [ 0.000000] NODE_DATA [0x0000607ffbb000-0x0000607ffbffff]
    [ 0.000000] Initmem setup node 6 [0000006080000000-000000707fffffff]
    [ 0.000000] NODE_DATA [0x0000707ffbb000-0x0000707ffbffff]
    [ 0.000000] Initmem setup node 7 [0000007080000000-000000807fffffff]
    [ 0.000000] NODE_DATA [0x0000807ffba000-0x0000807ffbefff]

    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     
  • While debugging kdump, I found that the current kernel has a problem
    with crashkernel=512M.

    It turns out that the initial mapping goes up to 512M, and the later
    initial mapping up to 4G (actually 2040M on my platform) puts the page
    table near 512M; the initial mapping up to 128G then lands near 2G.

    before this patch:
    [ 0.000000] initial memory mapped : 0 - 20000000
    [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
    [ 0.000000] 0000000000 - 007f600000 page 2M
    [ 0.000000] 007f600000 - 007f750000 page 4k
    [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x1fffc000-0x1fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x1fffc000-0x1fffdfff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000207fffffff]
    [ 0.000000] 0100000000 - 2080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x7bc01000-0x7bc83fff]
    [ 0.000000] memblock_x86_reserve_range: [0x7bc01000-0x7bc7efff] PGTABLE
    [ 0.000000] RAMDISK: 7bc84000 - 7f745000
    [ 0.000000] crashkernel reservation failed - No suitable area found.

    after patch:
    [ 0.000000] initial memory mapped : 0 - 20000000
    [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
    [ 0.000000] 0000000000 - 007f600000 page 2M
    [ 0.000000] 007f600000 - 007f750000 page 4k
    [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff]
    [ 0.000000] memblock_x86_reserve_range: [0x7f74c000-0x7f74dfff] PGTABLE
    [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000207fffffff]
    [ 0.000000] 0100000000 - 2080000000 page 2M
    [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff]
    [ 0.000000] memblock_x86_reserve_range: [0x207ff7d000-0x207fffafff] PGTABLE
    [ 0.000000] RAMDISK: 7bc84000 - 7f745000
    [ 0.000000] memblock_x86_reserve_range: [0x17000000-0x36ffffff] CRASH KERNEL
    [ 0.000000] Reserving 512MB of memory at 368MB for crashkernel (System RAM: 133120MB)

    With the patch, the page tables for [0, 2g) are placed near 2g instead
    of under 512M, and the page tables for [4g, 128g) near 128g instead of
    under 2g.

    That is good: if we have lots of memory above 4g (like 1024g, 2048g or
    16T), the related page tables are no longer placed under 2g, which
    leaves a better chance of keeping the area under 2g free when 1G or 2M
    pages are not used.

    The code change adds map_low_page() and updates unmap_low_page() for
    64-bit, and uses them to access the corresponding high memory for page
    table setup.

    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     

18 Nov, 2010

1 commit

  • This patch expands functionality of CONFIG_DEBUG_RODATA to set main
    (static) kernel data area as NX.

    The following steps are taken to achieve this:

    1. The linker script is adjusted so .text always starts and ends on a
    page boundary.
    2. The linker script is adjusted so .rodata always starts and ends on
    a page boundary.
    3. NX is set for all pages from _etext through _end in mark_rodata_ro().
    4. free_init_pages() sets released memory NX in arch/x86/mm/init.c.
    5. The BIOS ROM is set to x (executable) when pcibios is used.

    The results of patch application may be observed in the diff of kernel page
    table dumps:

    pcibios:

    --- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400
    +++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400
    0x00000000-0xc0000000 3G pmd
    ---[ Kernel Mapping ]---
    -0xc0000000-0xc0100000 1M RW GLB x pte
    +0xc0000000-0xc00a0000 640K RW GLB NX pte
    +0xc00a0000-0xc0100000 384K RW GLB x pte
    -0xc0100000-0xc03d7000 2908K ro GLB x pte
    +0xc0100000-0xc0318000 2144K ro GLB x pte
    +0xc0318000-0xc03d7000 764K ro GLB NX pte
    -0xc03d7000-0xc0600000 2212K RW GLB x pte
    +0xc03d7000-0xc0600000 2212K RW GLB NX pte
    0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd
    0xf7a00000-0xf7bfe000 2040K RW GLB NX pte
    0xf7bfe000-0xf7c00000 8K pte

    No pcibios:

    --- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400
    +++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400
    0x00000000-0xc0000000 3G pmd
    ---[ Kernel Mapping ]---
    -0xc0000000-0xc0100000 1M RW GLB x pte
    +0xc0000000-0xc0100000 1M RW GLB NX pte
    -0xc0100000-0xc03d7000 2908K ro GLB x pte
    +0xc0100000-0xc0318000 2144K ro GLB x pte
    +0xc0318000-0xc03d7000 764K ro GLB NX pte
    -0xc03d7000-0xc0600000 2212K RW GLB x pte
    +0xc03d7000-0xc0600000 2212K RW GLB NX pte
    0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd
    0xf7a00000-0xf7bfe000 2040K RW GLB NX pte
    0xf7bfe000-0xf7c00000 8K pte

    The patch was originally developed for Linux 2.6.34-rc2 x86 by
    Siarhei Liakh and Xuxian Jiang.

    -v1: initial patch for 2.6.30
    -v2: patch for 2.6.31-rc7
    -v3: moved all code into arch/x86, adjusted credits
    -v4: fixed ifdef, removed credits from CREDITS
    -v5: fixed an address calculation bug in mark_nxdata_nx()
    -v6: added acked-by and PT dump diff to commit log
    -v7: minor adjustments for -tip
    -v8: rework with the merge of "Set first MB as RW+NX"

    Signed-off-by: Siarhei Liakh
    Signed-off-by: Xuxian Jiang
    Signed-off-by: Matthieu CASTET
    Cc: Arjan van de Ven
    Cc: James Morris
    Cc: Andi Kleen
    Cc: Rusty Russell
    Cc: Stephen Rothwell
    Cc: Dave Jones
    Cc: Kees Cook
    Cc: Linus Torvalds
    LKML-Reference:
    [ minor cleanliness edits ]
    Signed-off-by: Ingo Molnar

    Matthieu Castet
     

28 Aug, 2010

1 commit


05 Apr, 2010

1 commit


30 Mar, 2010

2 commits

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h, and thus ends up being
    included when building most .c files. percpu.h includes slab.h, which
    in turn includes gfp.h, making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As this
    conversion needs to touch a large number of source files, the following
    script was used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes so that
    only the necessary includes are there, i.e. gfp.h if only gfp is
    used, slab.h if slab is used.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered:
    alphabetical, Christmas tree, reverse Christmas tree, or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given that I had only a couple of failures from the tests in step 6,
    I'm fairly confident about the coverage of this conversion patch. If
    there is a breakage, it's likely to be something in one of the arch
    headers, which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     
  • When CONFIG_NO_BOOTMEM=y, memory can be used more efficiently, in a
    more compact fashion.

    Example:

    Allocated new RAMDISK: 00ec2000 - 0248ce57
    Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56

    The new RAMDISK's end is not page aligned, so its last page could be
    shared with other users.

    When free_init_pages() is called for the initrd or .init, that page
    could be freed and we could corrupt other data.

    code segment in free_init_pages():

        for (; addr < end; addr += PAGE_SIZE) {
                ClearPageReserved(virt_to_page(addr));
                init_page_count(virt_to_page(addr));
                memset((void *)(addr & ~(PAGE_SIZE-1)),
                       POISON_FREE_INITMEM, PAGE_SIZE);
                free_page(addr);
                totalram_pages++;
        }

    The last half page could then be handed out as one whole free page.

    So page align the boundaries.

    -v2: make the original initrd aligned, according to Johannes;
    otherwise we have a chance of losing one page. We still need to keep
    initrd_end unaligned, otherwise it could confuse the decompressor.
    -v3: change to WARN_ON instead, suggested by Johannes.
    -v4: use PAGE_ALIGN, suggested by Johannes. We may later rename that
    macro to PAGE_ALIGN_UP, alongside a PAGE_ALIGN_DOWN. Add comments
    about assuming the ramdisk start is aligned in relocate_initrd();
    change to re-read ramdisk_image instead of saving it, to make the
    diff smaller. Add a warning for wrong ranges, suggested by Johannes.
    -v6: remove one WARN(). We need to align the beginning in
    free_init_pages(); do not copy more than ramdisk_size, noticed by
    Johannes.

    Reported-by: Stanislaw Gruszka
    Tested-by: Stanislaw Gruszka
    Signed-off-by: Yinghai Lu
    Acked-by: Johannes Weiner
    Cc: David Miller
    Cc: Benjamin Herrenschmidt
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

26 Feb, 2010

1 commit

  • This patch changes the 32-bit version of kernel_physical_mapping_init()
    to return the last mapped address, like the 64-bit one does, so that we
    can unify the call site in init_memory_mapping().

    Cc: Yinghai Lu
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Pekka Enberg
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Pekka Enberg
     

17 Nov, 2009

2 commits

  • It is possible for x86_64 systems to lack the NX bit, either because
    the hardware lacks support or because the BIOS has turned off the CPU
    capability, so NX status should be reported. Additionally, anyone
    booting NX-capable CPUs in 32-bit mode without PAE will lack NX
    functionality, so this change provides feedback for that case as well.

    Signed-off-by: Kees Cook
    Signed-off-by: H. Peter Anvin
    LKML-Reference:

    Kees Cook
     
  • The 32- and 64-bit code used very different mechanisms for enabling
    NX, but even the 32-bit code was enabling NX in head_32.S if it was
    available. Furthermore, we had a bewildering collection of tests for
    the availability of NX.

    This patch:

    a) merges the 32-bit set_nx() and the 64-bit check_efer() functions
    into a single x86_configure_nx() function. EFER control is left
    to the head code.

    b) eliminates the nx_enabled variable entirely. Things that need to
    test for NX enablement can verify __supported_pte_mask directly,
    and cpu_has_nx gives the supported status of NX.

    Signed-off-by: H. Peter Anvin
    Cc: Tejun Heo
    Cc: Brian Gerst
    Cc: Yinghai Lu
    Cc: Pekka Enberg
    Cc: Vegard Nossum
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    LKML-Reference:
    Acked-by: Kees Cook

    H. Peter Anvin
     

22 Sep, 2009

1 commit


01 Jul, 2009

1 commit

  • This sparse warning:

    arch/x86/mm/init.c:83:16: warning: symbol 'check_efer' was not declared. Should it be static?

    triggers because check_efer() is not declared before it is used.
    asm/proto.h includes the declaration of check_efer(), so including
    asm/proto.h fixes that and also addresses the sparse warning.

    Signed-off-by: Jaswinder Singh Rajput
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jaswinder Singh Rajput
     

23 Jun, 2009

1 commit

  • The init_gbpages() function is conditionally called from
    init_memory_mapping() function. There are two call-sites where
    this 'after_bootmem' condition can be true: setup_arch() and
    mem_init() via pci_iommu_alloc().

    Therefore, it's safe to move the call to init_gbpages() to
    setup_arch() as it's always called before mem_init().

    This removes an after_bootmem use - paving the way to remove
    all uses of that state variable.

    Signed-off-by: Pekka Enberg
    Acked-by: Yinghai Lu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Pekka J Enberg
     

15 Jun, 2009

1 commit

  • The hooks that we modify are:
    - Page fault handler (to handle kmemcheck faults)
    - Debug exception handler (to hide pages after single-stepping
    the instruction that caused the page fault)

    Also redefine memset() to use the optimized version if kmemcheck is
    enabled.

    (Thanks to Pekka Enberg for minimizing the impact on the page fault
    handler.)

    As kmemcheck doesn't handle MMX/SSE instructions (yet), we also disable
    the optimized xor code, and rely instead on the generic C implementation
    in order to avoid false-positive warnings.

    Signed-off-by: Vegard Nossum

    [whitespace fixlet]
    Signed-off-by: Pekka Enberg
    Signed-off-by: Ingo Molnar

    [rebased for mainline inclusion]
    Signed-off-by: Vegard Nossum

    Vegard Nossum
     

11 Jun, 2009

2 commits

  • * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    x86: fix system without memory on node0
    x86, mm: Fix node_possible_map logic
    mm, x86: remove MEMORY_HOTPLUG_RESERVE related code
    x86: make sparse mem work in non-NUMA mode
    x86: process.c, remove useless headers
    x86: merge process.c a bit
    x86: use sparse_memory_present_with_active_regions() on UMA
    x86: unify 64-bit UMA and NUMA paging_init()
    x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB
    x86: Sanity check the e820 against the SRAT table using e820 map only
    x86: clean up and and print out initial max_pfn_mapped
    x86/pci: remove rounding quirk from e820_setup_gap()
    x86, e820, pci: reserve extra free space near end of RAM
    x86: fix typo in address space documentation
    x86: 46 bit physical address support on 64 bits
    x86, mm: fault.c, use printk_once() in is_errata93()
    x86: move per-cpu mmu_gathers to mm/init.c
    x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c
    x86: unify noexec handling
    x86: remove (null) in /sys kernel_page_tables
    ...

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, nmi: Use predefined numbers instead of hardcoded one
    x86: asm/processor.h: remove double declaration
    x86, mtrr: replace MTRRdefType_MSR with msr-index's MSR_MTRRdefType
    x86, mtrr: replace MTRRfix4K_C0000_MSR with msr-index's MSR_MTRRfix4K_C0000
    x86, mtrr: remove mtrr MSRs double declaration
    x86, mtrr: replace MTRRfix16K_80000_MSR with msr-index's MSR_MTRRfix16K_80000
    x86, mtrr: replace MTRRfix64K_00000_MSR with msr-index's MSR_MTRRfix64K_00000
    x86, mtrr: replace MTRRcap_MSR with msr-index's MSR_MTRRcap
    x86: mce: remove duplicated #include
    x86: msr-index.h remove duplicate MSR C001_0015 declaration
    x86: clean up arch/x86/kernel/tsc_sync.c a bit
    x86: use symbolic name for VM86_SIGNAL when used as vm86 default return
    x86: added 'ifndef _ASM_X86_IOMAP_H' to iomap.h
    x86: avoid multiple declaration of kstack_depth_to_print
    x86: vdso/vma.c declare vdso_enabled and arch_setup_additional_pages before they get used
    x86: clean up declarations and variables
    x86: apic/x2apic_cluster.c x86_cpu_to_logical_apicid should be static
    x86 early quirks: eliminate unused function

    Linus Torvalds
     

11 May, 2009

2 commits


08 May, 2009

1 commit

  • With the introduction of the .brk section, special care must be taken
    that no unused page table entries remain if _brk_end and _end are
    separated by a 2M page boundary. cleanup_highmap() runs very early and
    hence cannot take care of that, hence potential entries needing to be
    removed past _brk_end must be cleared once the brk allocator has done
    its job.

    [ Impact: avoids undesirable TLB aliases ]

    Signed-off-by: Jan Beulich
    Signed-off-by: H. Peter Anvin

    Jan Beulich
     

30 Apr, 2009

1 commit


21 Apr, 2009

1 commit


12 Apr, 2009

1 commit

  • Impact: cleanup, no code changed

    - syscalls.h update declarations due to unifications
    - irq.c declare smp_generic_interrupt() before it gets used
    - process.c declare sys_fork() and sys_vfork() before they get used
    - tsc.c rename tsc_khz shadowed variable
    - apic/probe_32.c declare apic_default before it gets used
    - apic/nmi.c prev_nmi_count should be unsigned
    - apic/io_apic.c declare smp_irq_move_cleanup_interrupt() before it gets used
    - mm/init.c declare direct_gbpages and free_initrd_mem before they get used

    Signed-off-by: Jaswinder Singh Rajput
    Signed-off-by: Ingo Molnar

    Jaswinder Singh Rajput
     

13 Mar, 2009

1 commit


06 Mar, 2009

1 commit


05 Mar, 2009

4 commits