30 May, 2012

1 commit

  • Print physical address info in a style consistent with the %pR style used
    elsewhere in the kernel. For example:

    -found SMP MP-table at [ffff8800000fce90] fce90
    +found SMP MP-table at [mem 0x000fce90-0x000fce9f] mapped at [ffff8800000fce90]
    -initial memory mapped : 0 - 20000000
    +initial memory mapped: [mem 0x00000000-0x1fffffff]
    -Base memory trampoline at [ffff88000009c000] 9c000 size 8192
    +Base memory trampoline [mem 0x0009c000-0x0009dfff] mapped at [ffff88000009c000]
    -SRAT: Node 0 PXM 0 0-80000000
    +SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]

    Signed-off-by: Bjorn Helgaas
    Cc: Yinghai Lu
    Cc: Konrad Rzeszutek Wilk
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bjorn Helgaas
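
The old messages gave a start address plus either an exclusive end or a size, while the new ones give an inclusive range. A minimal userspace sketch of that conversion follows; `fmt_mem_range()` is a hypothetical helper introduced for illustration, not the kernel's `%pR` implementation.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a physical range given as (start, size) in the inclusive
 * "[mem 0xSTART-0xEND]" style shown in the new messages.
 * Hypothetical userspace helper; size must be non-zero.
 */
static int fmt_mem_range(char *buf, size_t len, uint64_t start, uint64_t size)
{
	uint64_t end = start + size - 1;	/* inclusive end address */

	return snprintf(buf, len, "[mem 0x%08" PRIx64 "-0x%08" PRIx64 "]",
			start, end);
}
```

Called with the trampoline values from the old message (start 0x9c000, size 8192), this yields the range shown in the new message.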
     

09 May, 2012

1 commit

  • Allows emulating more interesting NUMA configurations, such as a
    quad-socket AMD Magny-Cours:

    "numa=fake=8:10,16,16,22,16,22,16,22,
    16,10,22,16,22,16,22,16,
    16,22,10,16,16,22,16,22,
    22,16,16,10,22,16,22,16,
    16,22,16,22,10,16,16,22,
    22,16,22,16,16,10,22,16,
    16,22,16,22,16,22,10,16,
    22,16,22,16,22,16,16,10"

    Which has a non-fully-connected topology.

    Signed-off-by: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Yinghai Lu
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/n/tip-e1136ef7kdffj7yf9tjhydln@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
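
The list after the colon reads as an 8x8 distance matrix in row-major order. The sketch below is a userspace illustration of parsing such a list and counting node pairs that are farther apart than a chosen "direct" distance. The function names and `MAX_EMU_NODES` are assumptions made for this example, and it does not reproduce the kernel's own option parser.

```c
#include <stdlib.h>

#define MAX_EMU_NODES 8	/* illustrative limit, not a kernel constant */

/*
 * Parse a comma-separated list of n*n positive distances into dist[][],
 * row by row. Whitespace before a number is skipped by strtol().
 * Returns 0 on success, -1 if the list is short or malformed.
 */
static int parse_distances(const char *s, int n,
			   int dist[MAX_EMU_NODES][MAX_EMU_NODES])
{
	if (n < 1 || n > MAX_EMU_NODES)
		return -1;

	for (int i = 0; i < n; i++) {
		for (int j = 0; j < n; j++) {
			char *end;
			long d = strtol(s, &end, 10);

			if (end == s || d <= 0)
				return -1;
			dist[i][j] = (int)d;
			s = end;
			if (*s == ',')
				s++;
		}
	}
	return 0;
}

/* Count ordered pairs (i, j), i != j, whose distance exceeds 'direct'. */
static int count_far_pairs(int n, int dist[MAX_EMU_NODES][MAX_EMU_NODES],
			   int direct)
{
	int far = 0;

	for (int i = 0; i < n; i++)
		for (int j = 0; j < n; j++)
			if (i != j && dist[i][j] > direct)
				far++;
	return far;
}
```

A non-zero result from `count_far_pairs()` with `direct` set to 16 is one way to see that a table like the one above is not fully connected, under the assumption that 16 denotes a direct link.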
     

23 Mar, 2012

1 commit


22 Mar, 2012

1 commit

  • Without this fix, cpumask_of_node() for numa=fake=2 is:

    cpumask 0 ff
    cpumask 1 ff

    With the fix it is correct and is set to:

    cpumask 0 55
    cpumask 1 aa

    Signed-off-by: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: Johannes Weiner
    Cc: David Rientjes
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
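
The corrected masks 55 and aa are what one would get on an 8-CPU machine whose CPUs alternate between two nodes. The commit message does not state the CPU count or the assignment policy, so both are assumptions in this userspace sketch; `node_cpumask()` is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Mask of the CPUs that land on 'node' when 'ncpus' CPUs are spread
 * round-robin over 'nnodes' nodes (cpu % nnodes). Assumes ncpus <= 64
 * and nnodes > 0. Round-robin placement is an assumption of this sketch.
 */
static uint64_t node_cpumask(unsigned int node, unsigned int nnodes,
			     unsigned int ncpus)
{
	uint64_t mask = 0;

	for (unsigned int cpu = 0; cpu < ncpus; cpu++)
		if (cpu % nnodes == node)
			mask |= UINT64_C(1) << cpu;
	return mask;
}
```

With two nodes and eight CPUs, node 0 gets the even-numbered CPUs and node 1 the odd-numbered ones, so no CPU appears in both masks, unlike the ff/ff result before the fix.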
     

04 Mar, 2012

1 commit

  • mem_hole_size() is called only from __init-marked functions, and as
    such should be moved to the .init section as well. Fixes this warning:

    WARNING: vmlinux.o(.text+0x35511): Section mismatch in reference from the function mem_hole_size() to the function .init.text:absent_pages_in_range()

    Signed-off-by: Jiri Kosina
    Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1202281614450.31150@pobox.suse.cz
    Signed-off-by: H. Peter Anvin

    Jiri Kosina
     

15 Jul, 2011

2 commits

  • Other than a sanity check and a debug message, the x86-specific
    versions of the memblock reserve/free functions are simple wrappers
    around the generic versions - memblock_reserve/free().

    This patch adds debug messages with caller identification to the
    generic versions, converts the users of the x86-specific ones to
    them, and kills the x86-specific ones.
    arch/x86/include/asm/memblock.h and arch/x86/mm/memblock.c are empty
    after this change and are removed.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/1310462166-31469-14-git-send-email-tj@kernel.org
    Cc: Yinghai Lu
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: H. Peter Anvin

    Tejun Heo
     
  • memblock_x86_hole_size() calculates the total size of holes in a given
    range according to memblock and is used by numa emulation code and
    numa_meminfo_cover_memory().

    Since conversion to MEMBLOCK_NODE_MAP, absent_pages_in_range() also
    uses memblock and gives the same result. This patch replaces
    memblock_x86_hole_size() uses with absent_pages_in_range(). After the
    conversion the x86 function doesn't have any user left and is killed.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/1310462166-31469-12-git-send-email-tj@kernel.org
    Cc: Yinghai Lu
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: H. Peter Anvin

    Tejun Heo
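
The "total size of holes in a range" computation can be pictured as the range length minus the pages covered by known memory regions. The following is a userspace sketch under that reading; `struct pfn_range` and `absent_pages()` are hypothetical names, and the real absent_pages_in_range() works on memblock data rather than a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* A memory region in page frame numbers; end_pfn is exclusive. */
struct pfn_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/*
 * Number of pages in [start, end) not covered by any region in mem[].
 * Regions must not overlap each other; start must not exceed end.
 */
static uint64_t absent_pages(const struct pfn_range *mem, size_t n,
			     uint64_t start, uint64_t end)
{
	uint64_t present = 0;

	for (size_t i = 0; i < n; i++) {
		/* Clamp each region to the queried range. */
		uint64_t s = mem[i].start_pfn > start ? mem[i].start_pfn : start;
		uint64_t e = mem[i].end_pfn < end ? mem[i].end_pfn : end;

		if (s < e)
			present += e - s;
	}
	return (end - start) - present;
}
```

For regions [0, 0x9f) and [0x100, 0x1000), querying [0, 0x1000) reports the 0x61 pages of the gap between them.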
     

14 Jul, 2011

1 commit

  • 25818f0f28 (memblock: Make MEMBLOCK_ERROR be 0) thankfully made
    MEMBLOCK_ERROR 0, and there is already code which expects the error
    return to be 0. There's no point in keeping MEMBLOCK_ERROR around.
    End its misery.

    Signed-off-by: Tejun Heo
    Link: http://lkml.kernel.org/r/1310457490-3356-6-git-send-email-tj@kernel.org
    Cc: Yinghai Lu
    Cc: Benjamin Herrenschmidt
    Signed-off-by: H. Peter Anvin

    Tejun Heo
     

02 May, 2011

1 commit

  • Now that the NUMA init path is unified, NUMA emulation can be enabled
    on 32bit. Make numa_emulation.c safe on 32bit by doing the following.

    * Define MAX_DMA32_PFN on 32bit too.

    * Include bootmem.h for max_pfn declaration.

    * Use u64 explicitly and always use PFN_PHYS() when converting page
    number to address.

    * Avoid __udivdi3() generation on 32bit by doing the calculation in
    number of pages instead in split_nodes_interleave().

    And drop X86_64 dependency from Kconfig.

    Signed-off-by: Tejun Heo
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: David Rientjes
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"

    Tejun Heo
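
The point about using u64 explicitly when converting a page number to an address can be shown in userspace: if the shift is done in 32 bits, any address at or above 4 GiB is truncated. The helper names and the 4 KiB page size below are assumptions for illustration; this is not the kernel's PFN_PHYS() definition.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/* Widen to 64 bits before shifting so the high bits survive. */
static inline uint64_t sketch_pfn_phys(uint32_t pfn)
{
	return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
}

/*
 * Deliberately wrong variant: on platforms where unsigned int is
 * 32 bits wide, the shift happens in 32 bits and wraps before the
 * result is widened to the 64-bit return type.
 */
static inline uint64_t sketch_pfn_phys_truncating(uint32_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}
```

Page frame 0x100000 is the 4 GiB boundary with 4 KiB pages: the first helper returns 0x100000000, while the truncating one returns 0 on a platform with 32-bit unsigned int.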
     

21 Apr, 2011

1 commit

  • The cpunode mappings under CONFIG_DEBUG_PER_CPU_MAPS=y
    when NUMA emulation is enabled are currently broken because the
    code does not iterate through every emulated node and bind cpus
    that have affinity to it.

    NUMA emulation should bind each cpu to every local node to
    accurately represent the true NUMA topology of the underlying
    machine.

    debug_cpumask_set_cpu() needs to be fixed at the same time so
    that the debugging information that it emits shows the new
    cpumask of the node being assigned when the cpu is being added
    or removed.

    It can now take responsibility for setting or clearing the cpu
    itself to remove the need for duplicate code.

    Also change its last parameter, "enable", to have the correct bool
    type since it can only be true or false.

    -v2: Fix the return statements, by Kosaki Motohiro

    Acked-and-Tested-by: KOSAKI Motohiro
    Signed-off-by: David Rientjes
    Cc: Andreas Herrmann
    Cc: Tejun Heo
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1104201918470.12634@chino.kir.corp.google.com
    Signed-off-by: Ingo Molnar

    David Rientjes
     

12 Mar, 2011

1 commit

  • The distance transformation in numa_emulation() used to call
    numa_set_distance() for all MAX_NUMNODES * MAX_NUMNODES node
    combinations regardless of which are enabled. As numa_set_distance()
    ignores all out-of-bound distance settings, this doesn't cause any
    problem other than looping unnecessarily many times during boot.

    However, as MAX_NUMNODES * MAX_NUMNODES can be pretty high, update the
    code such that it iterates through only the enabled combinations.

    Yinghai Lu identified the issue and provided an initial patch to
    address it; however, the patch was incorrect in that it didn't
    build the emulated distance table when there's no physical distance
    table, and it was unnecessarily complex.

    http://thread.gmane.org/gmane.linux.kernel/1107986/focus=1107988

    Signed-off-by: Tejun Heo
    Reported-by: Yinghai Lu
    Acked-by: Yinghai Lu

    Tejun Heo
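
The difference between walking every possible node pair and only the enabled ones can be sketched in userspace with a bitmask standing in for the node mask. `SKETCH_MAX_NODES`, `set_distance_fn`, and `for_each_enabled_pair()` are hypothetical names for this illustration; the kernel uses its own nodemask iterators.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64	/* illustrative stand-in for MAX_NUMNODES */

typedef void (*set_distance_fn)(int from, int to);

/*
 * Invoke set(i, j) for every ordered pair of nodes whose bits are set
 * in 'enabled' and return how many pairs were visited. 'set' may be
 * NULL, in which case the pairs are only counted.
 */
static unsigned int for_each_enabled_pair(uint64_t enabled, set_distance_fn set)
{
	unsigned int calls = 0;

	for (int i = 0; i < SKETCH_MAX_NODES; i++) {
		if (!(enabled & (UINT64_C(1) << i)))
			continue;	/* skip whole rows of disabled nodes */
		for (int j = 0; j < SKETCH_MAX_NODES; j++) {
			if (!(enabled & (UINT64_C(1) << j)))
				continue;
			if (set)
				set(i, j);
			calls++;
		}
	}
	return calls;
}
```

With two enabled nodes this visits 4 pairs, compared with 4096 calls for an unconditional 64 x 64 walk.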
     

04 Mar, 2011

2 commits

  • Undetermined entries in emu_nid_to_phys[] are filled with zero
    assuming that physical node 0 is always online; however, this might
    not be true depending on hardware configuration. Find a physical node
    which is actually online and use it instead.

    Signed-off-by: Tejun Heo
    Reported-by: David Rientjes
    LKML-Reference:

    Tejun Heo
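
The fix described above, picking a physical node that is actually online instead of assuming node 0, might look like the following in a userspace sketch. `SKETCH_NO_NODE` and `fill_undetermined()` are hypothetical names for this illustration, and the arrays stand in for the kernel's emu_nid_to_phys[] and its online-node information.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NO_NODE (-1)	/* marks an undetermined entry */

/*
 * Replace every undetermined entry in emu_to_phys[] with the first
 * physical node marked online. Returns the node used as the default,
 * or -1 (leaving the table untouched) if no physical node is online.
 */
static int fill_undetermined(int *emu_to_phys, size_t n,
			     const bool *phys_online, size_t nr_phys)
{
	int dfl = SKETCH_NO_NODE;

	for (size_t i = 0; i < nr_phys; i++) {
		if (phys_online[i]) {
			dfl = (int)i;
			break;
		}
	}
	if (dfl == SKETCH_NO_NODE)
		return -1;

	for (size_t i = 0; i < n; i++)
		if (emu_to_phys[i] == SKETCH_NO_NODE)
			emu_to_phys[i] = dfl;
	return dfl;
}
```

On a machine where physical node 0 is offline, the undetermined entries end up pointing at the first online node rather than at node 0.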
     
  • On a system that does not have RAM on node0, when numa_emulation is
    compiled in and the system is booted either
    1. without numa=fake..., or
    2. with numa=fake=128 to make emulation fail,

    we get:

    [ 0.092026] ------------[ cut here ]------------
    [ 0.096005] kernel BUG at arch/x86/mm/numa_emulation.c:439!
    [ 0.096005] invalid opcode: 0000 [#1] SMP
    [ 0.096005] last sysfs file:
    [ 0.096005] CPU 0
    [ 0.096005] Modules linked in:
    [ 0.096005]
    [ 0.096005] Pid: 0, comm: swapper Not tainted 2.6.38-rc6-tip-yh-03869-gcb0491d-dirty #684 Sun Microsystems Sun Fire X4240/Sun Fire X4240
    [ 0.096005] RIP: 0010:[] [] numa_add_cpu+0x56/0xcf
    [ 0.096005] RSP: 0000:ffffffff82437ed8 EFLAGS: 00010246
    ...
    [ 0.096005] Call Trace:
    [ 0.096005] [] identify_cpu+0x2d7/0x2df
    [ 0.096005] [] identify_boot_cpu+0x10/0x30
    [ 0.096005] [] check_bugs+0x9/0x2d
    [ 0.096005] [] start_kernel+0x3d7/0x3f1
    [ 0.096005] [] x86_64_start_reservations+0x9c/0xa0
    [ 0.096005] [] x86_64_start_kernel+0x1dd/0x1e8
    [ 0.096005] Code: 74 06 48 8d 04 90 eb 0f 48 c7 c0 30 d9 00 00 48 03 04 d5 90 0f 60 82 8b 00 83 f8 ff 74 0d 0f a3 05 8b 7e 92 00 19 d2 85 d2 75 02 0b 48 98 be 00 01 00 00 48 c7 c7 e0 44 60 82 44 8b 2c 85 e0
    [ 0.096005] RIP [] numa_add_cpu+0x56/0xcf
    [ 0.096005] RSP
    [ 0.096026] ---[ end trace a7919e7f17c0a725 ]---

    We need to use early_cpu_to_node() directly, because numa_cpu_node()
    will return node0, which is not online.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Tejun Heo

    Yinghai Lu
     

02 Mar, 2011

2 commits

  • Handling of out-of-bounds distances and allocation failure can use
    better documentation. Add it.

    Signed-off-by: Tejun Heo
    Cc: Yinghai Lu
    Acked-by: David Rientjes

    Tejun Heo
     
  • NUMA distance table handling has the following problems.

    * numa_reset_distance() uses numa_distance_cnt * sizeof(numa_distance[0])
    as the table size when it should be using the square of
    numa_distance_cnt.

    * The same size miscalculation when allocating space for phys_dist in
    numa_emulation().

    * In numa_emulation(), phys_dist must be reserved; otherwise, the new
    emulated distance table may overlap it.

    Fix them and, while at it, take numa_distance_cnt resetting in
    numa_reset_distance() out of the if block to simplify the code a bit.

    David Rientjes reported incorrect handling of distance table during
    emulation.

    -tj: Edited out numa_alloc_distance() related changes which weren't
    necessary and rewrote patch description.

    -v2: Ingo was unhappy with linebreaks induced by the 80-column limit.
    Let lines run over 80 columns.

    Signed-off-by: Yinghai Lu
    Reported-by: David Rientjes
    Signed-off-by: Tejun Heo
    Cc: Ingo Molnar
    Acked-by: David Rientjes

    Yinghai Lu
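
The size miscalculation in the first two items comes down to treating an N x N table as if it had N entries. A small userspace sketch of the two formulas follows; the function names are hypothetical, and one-byte entries are an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a cnt x cnt table of one-byte distances. */
static size_t distance_table_bytes(size_t cnt)
{
	return cnt * cnt * sizeof(uint8_t);
}

/* The miscalculation: only one row's worth of entries. */
static size_t distance_table_bytes_buggy(size_t cnt)
{
	return cnt * sizeof(uint8_t);
}
```

For 8 nodes the table needs 64 bytes, while the buggy formula accounts for only 8, so a free or reservation based on it would cover just the first row.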
     

22 Feb, 2011

2 commits