Eric Lee / smarc-fsl-linux-kernel

12 Oct, 2016

1 commit

9099daed9 mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping ... Browse Code »

Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
physical address to a virtual one using __va(). However, such physical
addresses may sometimes be located in highmem and using __va() is
incorrect, leading to inconsistent object tracking in kmemleak.

The following functions have been added to the kmemleak API and they take
a physical address as the object pointer. They only perform the
corresponding action if the address has a lowmem mapping:

kmemleak_alloc_phys
kmemleak_free_part_phys
kmemleak_not_leak_phys
kmemleak_ignore_phys

The affected calling places have been updated to use the new kmemleak
API.

Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Reported-by: Vignesh R
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Catalin Marinas
2016-10-12 06:06:33 +0800

08 Oct, 2016

1 commit

8907de5dc mm/memblock.c: expose total reserved memory ... Browse Code »

The total reserved memory in a system is accounted but not available for
use use outside mm/memblock.c. By exposing the total reserved memory,
systems can better calculate the size of large hashes.

Link: http://lkml.kernel.org/r/1472476010-4709-3-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Srikar Dronamraju
Suggested-by: Mel Gorman
Cc: Vlastimil Babka
Cc: Michal Hocko
Cc: Michael Ellerman
Cc: Mahesh Salgaonkar
Cc: Hari Bathini
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Srikar Dronamraju
2016-10-08 09:46:28 +0800

05 Aug, 2016

2 commits

e47608ab6 mm/memblock.c: fix NULL dereference error ... Browse Code »

It causes NULL dereference error and failure to get type_a->regions[0]
info if parameter type_b of __next_mem_range_rev() == NULL

Fix this by checking before dereferring and initializing idx_b to 0

The approach is tested by dumping all types of region via
__memblock_dump_all() and __next_mem_range_rev() fixed to UART
separately the result is okay after checking the logs.

Link: http://lkml.kernel.org/r/57A0320D.6070102@zoho.com
Signed-off-by: zijun_hu
Tested-by: zijun_hu
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

zijun_hu
2016-08-05 08:02:09 +0800
412d0008d mm/memblock: fix a typo in a comment ... Browse Code »

s/accomodate/accommodate/

Link: http://lkml.kernel.org/r/20160804121824.18100-1-kuleshovmail@gmail.com
Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2016-08-05 08:02:09 +0800

29 Jul, 2016

3 commits

fb399b485 mm/memblock.c: fix index adjustment error in __next_mem_range_rev() ... Browse Code »

Fix region index adjustment error when parameter type_b of
__next_mem_range_rev() == NULL.

Signed-off-by: zijun_hu
Cc: Alexander Kuleshov
Cc: Ard Biesheuvel
Cc: Tang Chen
Cc: Wei Yang
Cc: Tang Chen
Cc: Richard Leitner
Cc: David Gibson
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

zijun_hu
2016-07-29 07:07:41 +0800
a571d4eb5 mm/memblock.c: add new infrastructure to address the mem limit issue ... Browse Code »

In some cases, memblock is queried by kernel to determine whether a
specified address is RAM or not. For example, the ACPI core needs this
information to determine which attributes to use when mapping ACPI
regions(acpi_os_ioremap). Use of incorrect memory types can result in
faults, data corruption, or other issues.

Removing memory with memblock_enforce_memory_limit() throws away this
information, and so a kernel booted with 'mem=' may suffer from the
issues described above. To avoid this, we need to keep those NOMAP
regions instead of removing all above the limit, which preserves the
information we need while preventing other use of those regions.

This patch adds new infrastructure to retain all NOMAP memblock regions
while removing others, to cater for this.

Link: http://lkml.kernel.org/r/1468475036-5852-2-git-send-email-dennis.chen@arm.com
Signed-off-by: Dennis Chen
Acked-by: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Pekka Enberg
Cc: Mel Gorman
Cc: Tang Chen
Cc: Tony Luck
Cc: Ingo Molnar
Cc: Rafael J. Wysocki
Cc: Will Deacon
Cc: Mark Rutland
Cc: Matt Fleming
Cc: Kaly Xin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dennis Chen
2016-07-29 07:07:41 +0800
c4c5ad6b3 memblock: include <asm/sections.h> instead of <asm-generic/sections.h> ... Browse Code »

asm-generic headers are generic implementations for architecture
specific code and should not be included by common code. Thus use the
asm/ version of sections.h to get at the linker sections.

Link: http://lkml.kernel.org/r/1468285103-7470-1-git-send-email-hch@lst.de
Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2016-07-29 07:07:41 +0800

27 Jul, 2016

1 commit

ef3cc4db4 mm/memblock.c:memblock_add_range(): if nr_new is 0 just return ... Browse Code »

If nr_new is 0 which means there's no region would be added, so just
return to the caller.

Signed-off-by: nimisolo
Cc: Alexander Kuleshov
Cc: Pekka Enberg
Cc: Tony Luck
Cc: Mel Gorman
Cc: Tang Chen
Cc: Wei Yang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

nimisolo
2016-07-27 07:19:19 +0800

21 May, 2016

2 commits

cd33a76b0 mm/memblock.c: remove unnecessary always-true comparison ... Browse Code »

Comparing an u64 variable to >= 0 returns always true and can therefore
be removed. This issue was detected using the -Wtype-limits gcc flag.

This patch fixes following type-limits warning:

mm/memblock.c: In function `__next_reserved_mem_region':
mm/memblock.c:843:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
if (*idx >= 0 && *idx < type->cnt) {

Link: http://lkml.kernel.org/r/20160510103625.3a7f8f32@g0hl1n.net
Signed-off-by: Richard Leitner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Richard Leitner
2016-05-21 08:58:30 +0800
f705ac4b3 mm/memblock.c: move memblock_{add,reserve}_region into memblock_{add,reserve} ... Browse Code »

memblock_add_region() and memblock_reserve_region() do nothing specific
before the call of memblock_add_range(), only print debug output.

We can do the same in memblock_add() and memblock_reserve() since both
memblock_add_region() and memblock_reserve_region() are not used by
anybody outside of memblock.c and memblock_{add,reserve}() have the same
set of flags and nids.

Since memblock_add_region() and memblock_reserve_region() will be
inlined, there will not be functional changes, but will improve code
readability a little.

Signed-off-by: Alexander Kuleshov
Acked-by: Ard Biesheuvel
Cc: Mel Gorman
Cc: Pekka Enberg
Cc: Tony Luck
Cc: Tang Chen
Cc: David Gibson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2016-05-21 08:58:30 +0800

18 Mar, 2016

1 commit

756a025f0 mm: coalesce split strings ... Browse Code »

Kernel style prefers a single string over split strings when the string is
'user-visible'.

Miscellanea:

- Add a missing newline
- Realign arguments

Signed-off-by: Joe Perches
Acked-by: Tejun Heo [percpu]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2016-03-18 06:09:34 +0800

16 Mar, 2016

1 commit

5aa174801 mm/memblock.c: remove unnecessary memblock_type variable ... Browse Code »

We define struct memblock_type *type in the memblock_add_region() and
memblock_reserve_region() functions only for passing it to the
memlock_add_range() and memblock_reserve_range() functions. Let's
remove these variables and will pass a type directly.

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2016-03-16 07:55:16 +0800

06 Feb, 2016

1 commit

1f1ffb8a1 memblock: don't mark memblock_phys_mem_size() as __init ... Browse Code »

At the moment memblock_phys_mem_size() is marked as __init, and so is
discarded after boot. This is different from most of the memblock
functions which are marked __init_memblock, and are only discarded after
boot if memory hotplug is not configured.

To allow for upcoming code which will need memblock_phys_mem_size() in
the hotplug path, change it from __init to __init_memblock.

Signed-off-by: David Gibson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Gibson
2016-02-06 10:10:40 +0800

15 Jan, 2016

3 commits

8c9c1701c mm/memblock: introduce for_each_memblock_type() ... Browse Code »

We already have the for_each_memblock() macro in
which provides ability to iterate over memblock regions of a known type.
The for_each_memblock() macro allows us to pass the pointer to the
struct memblock_type, instead we need to pass name of the type.

This patch introduces a new macro for_each_memblock_type() which allows
us iterate over memblock regions with the given type when the type is
unknown.

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2016-01-15 08:00:49 +0800
f14516fbf mm/memblock: remove rgnbase and rgnsize variables ... Browse Code »

Remove rgnbase and rgnsize variables from memblock_overlaps_region().
We use these variables only for passing to the memblock_addrs_overlap()
function and that's all. Let's remove them.

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2016-01-15 08:00:49 +0800
b4ad0c7e0 mm/memblock.c: memblock_is_memory()/reserved() can be boolean ... Browse Code »

Make memblock_is_memory() and memblock_is_reserved return bool to
improve readability due to these particular functions only using either
one or zero as their return value.

No functional change.

Signed-off-by: Yaowei Bai
Acked-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yaowei Bai
2016-01-15 08:00:49 +0800

10 Dec, 2015

1 commit

bf3d3cc58 mm/memblock: add MEMBLOCK_NOMAP attribute to memblock memory table ... Browse Code »

This introduces the MEMBLOCK_NOMAP attribute and the required plumbing
to make it usable as an indicator that some parts of normal memory
should not be covered by the kernel direct mapping. It is up to the
arch to actually honor the attribute when laying out this mapping,
but the memblock code itself is modified to disregard these regions
for allocations and other general use.

Cc: linux-mm@kvack.org
Cc: Alexander Kuleshov
Cc: Andrew Morton
Reviewed-by: Matt Fleming
Signed-off-by: Ard Biesheuvel
Signed-off-by: Will Deacon

Ard Biesheuvel
2015-12-10 00:56:58 +0800

06 Nov, 2015

1 commit

35bd16a22 mm/memblock: make memblock_remove_range() static ... Browse Code »

memblock_remove_range() is only used in the mm/memblock.c, so we can make
it static.

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-11-06 11:34:48 +0800

09 Sep, 2015

6 commits

ad5ea8cd5 mm/memblock.c: fix comment in __next_mem_range() ... Browse Code »

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-09-09 06:35:28 +0800
c11539315 mm/memblock.c: fiy typos in comments ... Browse Code »

s/succees/success/

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-09-09 06:35:28 +0800
567d117b8 mm/memblock.c: rename local variable of memblock_type to 'type' ... Browse Code »

Since commit e3239ff92a17 ("memblock: Rename memblock_region to
memblock_type and memblock_property to memblock_region"), all local
variables of the membock_type type were renamed to 'type'. This commit
renames all remaining local variables with the memblock_type type to the
same view.

Signed-off-by: Alexander Kuleshov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-09-09 06:35:28 +0800
95cf82ecc mem-hotplug: handle node hole when initializing numa_meminfo. ... Browse Code »

When parsing SRAT, all memory ranges are added into numa_meminfo. In
numa_init(), before entering numa_cleanup_meminfo(), all possible memory
ranges are in numa_meminfo. And numa_cleanup_meminfo() removes all
ranges over max_pfn or empty.

But, this only works if the nodes are continuous. Let's have a look at
the following example:

We have an SRAT like this:
SRAT: Node 0 PXM 0 [mem 0x00000000-0x5fffffff]
SRAT: Node 0 PXM 0 [mem 0x100000000-0x1ffffffffff]
SRAT: Node 1 PXM 1 [mem 0x20000000000-0x3ffffffffff]
SRAT: Node 4 PXM 2 [mem 0x40000000000-0x5ffffffffff] hotplug
SRAT: Node 5 PXM 3 [mem 0x60000000000-0x7ffffffffff] hotplug
SRAT: Node 2 PXM 4 [mem 0x80000000000-0x9ffffffffff] hotplug
SRAT: Node 3 PXM 5 [mem 0xa0000000000-0xbffffffffff] hotplug
SRAT: Node 6 PXM 6 [mem 0xc0000000000-0xdffffffffff] hotplug
SRAT: Node 7 PXM 7 [mem 0xe0000000000-0xfffffffffff] hotplug

On boot, only node 0,1,2,3 exist.

And the numa_meminfo will look like this:
numa_meminfo.nr_blks = 9
1. on node 0: [0, 60000000]
2. on node 0: [100000000, 20000000000]
3. on node 1: [20000000000, 40000000000]
4. on node 4: [40000000000, 60000000000]
5. on node 5: [60000000000, 80000000000]
6. on node 2: [80000000000, a0000000000]
7. on node 3: [a0000000000, a0800000000]
8. on node 6: [c0000000000, a0800000000]
9. on node 7: [e0000000000, a0800000000]

And numa_cleanup_meminfo() will merge 1 and 2, and remove 8,9 because the
end address is over max_pfn, which is a0800000000. But 4 and 5 are not
removed because their end addresses are less then max_pfn. But in fact,
node 4 and 5 don't exist.

In a word, numa_cleanup_meminfo() is not able to handle holes between nodes.

Since memory ranges in node 4 and 5 are in numa_meminfo, in
numa_register_memblks(), node 4 and 5 will be mistakenly set to online.

If you run lscpu, it will show:
NUMA node0 CPU(s): 0-14,128-142
NUMA node1 CPU(s): 15-29,143-157
NUMA node2 CPU(s):
NUMA node3 CPU(s):
NUMA node4 CPU(s): 62-76,190-204
NUMA node5 CPU(s): 78-92,206-220

In this patch, we use memblock_overlaps_region() to check if ranges in
numa_meminfo overlap with ranges in memory_block. Since memory_block
contains all available memory at boot time, if they overlap, it means the
ranges exist. If not, then remove them from numa_meminfo.

After this patch, lscpu will show:
NUMA node0 CPU(s): 0-14,128-142
NUMA node1 CPU(s): 15-29,143-157
NUMA node4 CPU(s): 62-76,190-204
NUMA node5 CPU(s): 78-92,206-220

Signed-off-by: Tang Chen
Reviewed-by: Yasuaki Ishimatsu
Cc: Thomas Gleixner
Cc: Tejun Heo
Cc: Luiz Capitulino
Cc: Xishi Qiu
Cc: Will Deacon
Cc: Vladimir Murzin
Cc: Fabian Frederick
Cc: Alexander Kuleshov
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tang Chen
2015-09-09 06:35:28 +0800
c5c5c9d10 mm/memblock.c: make memblock_overlaps_region() return bool. ... Browse Code »

memblock_overlaps_region() checks if the given memblock region
intersects a region in memblock. If so, it returns the index of the
intersected region.

But its only caller is memblock_is_region_reserved(), and it returns 0
if false, non-zero if true.

Both of these should return bool.

Signed-off-by: Tang Chen
Cc: Thomas Gleixner
Cc: Tejun Heo
Cc: Yasuaki Ishimatsu
Cc: Luiz Capitulino
Cc: Xishi Qiu
Cc: Will Deacon
Cc: Vladimir Murzin
Cc: Fabian Frederick
Cc: Alexander Kuleshov
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tang Chen
2015-09-09 06:35:28 +0800
4fcab5f43 mm/memblock.c: WARN_ON when flags differs from overlap region ... Browse Code »

Each memblock_region has flags to indicates the type of this range. For
the overlap case, memblock_add_range() inserts the lower part and leave the
upper part as indicated in the overlapped region.

If the flags of the new range differs from the overlapped region, the
information recorded is not correct.

This patch adds a WARN_ON when the flags of the new range differs from the
overlapped region.

Signed-off-by: Wei Yang
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wei Yang
2015-09-09 06:35:28 +0800

05 Sep, 2015

1 commit

c0a294988 mm/memblock: WARN_ON when nid differs from overlap region ... Browse Code »

Each memblock_region has nid to indicates the Node ID of this range. For
the overlap case, memblock_add_range() inserts the lower part and leave
the upper part as indicated in the overlapped region.

If the nid of the new range differs from the overlapped region, the
information recorded is not correct.

This patch adds a WARN_ON when the nid of the new range differs from the
overlapped region.

Signed-off-by: Wei Yang
Acked-by: David Rientjes
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Wei Yang
2015-09-05 07:54:41 +0800

01 Jul, 2015

2 commits

d70ddd7a5 mm: page_alloc: pass PFN to __free_pages_bootmem ... Browse Code »

__free_pages_bootmem prepares a page for release to the buddy allocator
and assumes that the struct page is initialised. Parallel initialisation
of struct pages defers initialisation and __free_pages_bootmem can be
called for struct pages that cannot yet map struct page to PFN. This
patch passes PFN to __free_pages_bootmem with no other functional change.

Signed-off-by: Mel Gorman
Tested-by: Nate Zimmer
Tested-by: Waiman Long
Tested-by: Daniel J Blueman
Acked-by: Pekka Enberg
Cc: Robin Holt
Cc: Nate Zimmer
Cc: Dave Hansen
Cc: Waiman Long
Cc: Scott Norton
Cc: "Luck, Tony"
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2015-07-01 10:44:55 +0800
8e7a7f861 memblock: introduce a for_each_reserved_mem_region iterator ... Browse Code »

Struct page initialisation had been identified as one of the reasons why
large machines take a long time to boot. Patches were posted a long time ago
to defer initialisation until they were first used. This was rejected on
the grounds it should not be necessary to hurt the fast paths. This series
reuses much of the work from that time but defers the initialisation of
memory to kswapd so that one thread per node initialises memory local to
that node.

After applying the series and setting the appropriate Kconfig variable I
see this in the boot log on a 64G machine

[ 7.383764] kswapd 0 initialised deferred memory in 188ms
[ 7.404253] kswapd 1 initialised deferred memory in 208ms
[ 7.411044] kswapd 3 initialised deferred memory in 216ms
[ 7.411551] kswapd 2 initialised deferred memory in 216ms

On a 1TB machine, I see

[ 8.406511] kswapd 3 initialised deferred memory in 1116ms
[ 8.428518] kswapd 1 initialised deferred memory in 1140ms
[ 8.435977] kswapd 0 initialised deferred memory in 1148ms
[ 8.437416] kswapd 2 initialised deferred memory in 1148ms

Once booted the machine appears to work as normal. Boot times were measured
from the time shutdown was called until ssh was available again. In the
64G case, the boot time savings are negligible. On the 1TB machine, the
savings were 16 seconds.

Nate Zimmer said:

: On an older 8 TB box with lots and lots of cpus the boot time, as
: measure from grub to login prompt, the boot time improved from 1484
: seconds to exactly 1000 seconds.

Waiman Long said:

: I ran a bootup timing test on a 12-TB 16-socket IvyBridge-EX system. From
: grub menu to ssh login, the bootup time was 453s before the patch and 265s
: after the patch - a saving of 188s (42%).

Daniel Blueman said:

: On a 7TB, 1728-core NumaConnect system with 108 NUMA nodes, we're seeing
: stock 4.0 boot in 7136s. This drops to 2159s, or a 70% reduction with
: this patchset. Non-temporal PMD init (https://lkml.org/lkml/2015/4/23/350)
: drops this to 1045s.

This patch (of 13):

As part of initializing struct page's in 2MiB chunks, we noticed that at
the end of free_all_bootmem(), there was nothing which had forced the
reserved/allocated 4KiB pages to be initialized.

This helper function will be used for that expansion.

Signed-off-by: Robin Holt
Signed-off-by: Nate Zimmer
Signed-off-by: Mel Gorman
Tested-by: Nate Zimmer
Tested-by: Waiman Long
Tested-by: Daniel J Blueman
Acked-by: Pekka Enberg
Cc: Robin Holt
Cc: Dave Hansen
Cc: Waiman Long
Cc: Scott Norton
Cc: "Luck, Tony"
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Robin Holt
2015-07-01 10:44:55 +0800

25 Jun, 2015

2 commits

a3f5bafcc mm/memblock: allocate boot time data structures from mirrored memory ... Browse Code »

Try to allocate all boot time kernel data structures from mirrored
memory.

If we run out of mirrored memory print warnings, but fall back to using
non-mirrored memory to make sure that we still boot.

By number of bytes, most of what we allocate at boot time is the page
structures. 64 bytes per 4K page on x86_64 ... or about 1.5% of total
system memory. For workloads where the bulk of memory is allocated to
applications this may represent a useful improvement to system
availability since 1.5% of total memory might be a third of the memory
allocated to the kernel.

Signed-off-by: Tony Luck
Cc: Xishi Qiu
Cc: Hanjun Guo
Cc: Xiexiuqi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2015-06-25 08:49:45 +0800
fc6daaf93 mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute ... Browse Code »

Some high end Intel Xeon systems report uncorrectable memory errors as a
recoverable machine check. Linux has included code for some time to
process these and just signal the affected processes (or even recover
completely if the error was in a read only page that can be replaced by
reading from disk).

But we have no recovery path for errors encountered during kernel code
execution. Except for some very specific cases were are unlikely to ever
be able to recover.

Enter memory mirroring. Actually 3rd generation of memory mirroing.

Gen1: All memory is mirrored
Pro: No s/w enabling - h/w just gets good data from other side of the
mirror
Con: Halves effective memory capacity available to OS/applications

Gen2: Partial memory mirror - just mirror memory begind some memory controllers
Pro: Keep more of the capacity
Con: Nightmare to enable. Have to choose between allocating from
mirrored memory for safety vs. NUMA local memory for performance

Gen3: Address range partial memory mirror - some mirror on each memory
controller
Pro: Can tune the amount of mirror and keep NUMA performance
Con: I have to write memory management code to implement

The current plan is just to use mirrored memory for kernel allocations.
This has been broken into two phases:

1) This patch series - find the mirrored memory, use it for boot time
allocations

2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
unused mirrored memory from mm/memblock.c and only give it out to
select kernel allocations (this is still being scoped because
page_alloc.c is scary).

This patch (of 3):

Add extra "flags" to memblock to allow selection of memory based on
attribute. No functional changes

Signed-off-by: Tony Luck
Cc: Xishi Qiu
Cc: Hanjun Guo
Cc: Xiexiuqi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2015-06-25 08:49:44 +0800

16 Apr, 2015

1 commit

6a4055bc7 mm/memblock.c: add debug output for memblock_add() ... Browse Code »

memblock_reserve() calls memblock_reserve_region() which prints debugging
information if 'memblock=debug' was passed on the command line. This
patch adds the same behaviour, but for memblock_add function().

[akpm@linux-foundation.org: s/memblock_memory/memblock_add/ in message]
Signed-off-by: Alexander Kuleshov
Cc: Martin Schwidefsky
Cc: Philipp Hachtmann
Cc: Fabian Frederick
Cc: Catalin Marinas
Cc: Emil Medve
Cc: Akinobu Mita
Cc: Tang Chen
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Kuleshov
2015-04-16 07:35:19 +0800

15 Apr, 2015

1 commit

7fc825b45 mm/memblock.c: rename local variable of memblock_type to `type' ... Browse Code »

A small cleanup. Seems in e3239ff9 ("memblock: Rename memblock_region to
memblock_type and memblock_property to memblock_region") this one was
missed.

Signed-off-by: Baoquan He
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Baoquan He
2015-04-15 07:49:00 +0800

14 Dec, 2014

1 commit

4308ce17f mm/memblock.c: refactor functions to set/clear MEMBLOCK_HOTPLUG ... Browse Code »

There is a lot of duplication in the rubric around actually setting or
clearing a mem region flag. Create a new helper function to do this and
reduce each of memblock_mark_hotplug() and memblock_clear_hotplug() to a
single line.

This will be useful if someone were to add a new mem region flag - which
I hope to be doing some day soon. But it looks like a plausible cleanup
even without that - so I'd like to get it out of the way now.

Signed-off-by: Tony Luck
Cc: Santosh Shilimkar
Cc: Tang Chen
Cc: Grygorii Strashko
Cc: Zhang Yanfei
Cc: Philipp Hachtmann
Cc: Yinghai Lu
Cc: Emil Medve
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2014-12-14 04:42:46 +0800

11 Sep, 2014

1 commit

0a313a998 mem-hotplug: let memblock skip the hotpluggable memory regions in __next_mem_range() ... Browse Code »

Let memblock skip the hotpluggable memory regions in __next_mem_range(),
it is used to to prevent memblock from allocating hotpluggable memory
for the kernel at early time. The code is the same as __next_mem_range_rev().

Clear hotpluggable flag before releasing free pages to the buddy
allocator. If we don't clear hotpluggable flag in
free_low_memory_core_early(), the memory which marked hotpluggable flag
will not free to buddy allocator. Because __next_mem_range() will skip
them.

free_low_memory_core_early
for_each_free_mem_range
for_each_mem_range
__next_mem_range

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Xishi Qiu
Cc: Tejun Heo
Cc: Tang Chen
Cc: Zhang Yanfei
Cc: Wen Congyang
Cc: "Rafael J. Wysocki"
Cc: "H. Peter Anvin"
Cc: Wu Fengguang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xishi Qiu
2014-09-11 06:42:12 +0800

30 Aug, 2014

1 commit

0cfb8f0c3 memblock, memhotplug: fix wrong type in memblock_find_in_range_node(). ... Browse Code »

In memblock_find_in_range_node(), we defined ret as int. But it should
be phys_addr_t because it is used to store the return value from
__memblock_find_range_bottom_up().

The bug has not been triggered because when allocating low memory near
the kernel end, the "int ret" won't turn out to be negative. When we
started to allocate memory on other nodes, and the "int ret" could be
minus. Then the kernel will panic.

A simple way to reproduce this: comment out the following code in
numa_init(),

memblock_set_bottom_up(false);

and the kernel won't boot.

Reported-by: Xishi Qiu
Signed-off-by: Tang Chen
Tested-by: Xishi Qiu
Cc: [3.13+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tang Chen
2014-08-30 07:28:15 +0800

07 Jun, 2014

1 commit

aedf95ea0 mm/memblock.c: call kmemleak directly from memblock_(alloc|free) ... Browse Code »

Kmemleak could ignore memory blocks allocated via memblock_alloc()
leading to false positives during scanning. This patch adds the
corresponding callbacks and removes kmemleak_free_* calls in
mm/nobootmem.c to avoid duplication.

The kmemleak_alloc() in mm/nobootmem.c is kept since
__alloc_memory_core_early() does not use memblock_alloc() directly.

Signed-off-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Catalin Marinas
2014-06-07 07:08:17 +0800

05 Jun, 2014

2 commits

f7e2f7e89 mm/memblock.c: use PFN_DOWN ... Browse Code »

Replace ((x) >> PAGE_SHIFT) with the pfn macro.

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-05 07:54:02 +0800
2bfc2862c memblock: introduce memblock_alloc_range() ... Browse Code »

This introduces memblock_alloc_range() which allocates memblock from the
specified range of physical address. I would like to use this function
to specify the location of CMA.

Signed-off-by: Akinobu Mita
Cc: Marek Szyprowski
Cc: Konrad Rzeszutek Wilk
Cc: David Woodhouse
Cc: Don Dutile
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Andi Kleen
Cc: Yinghai Lu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Akinobu Mita
2014-06-05 07:53:57 +0800

20 May, 2014

2 commits

70210ed95 mm/memblock: add physical memory list ... Browse Code »

Add the physmem list to the memblock structure. This list only exists
if HAVE_MEMBLOCK_PHYS_MAP is selected and contains the unmodified
list of physically available memory. It differs from the memblock
memory list as it always contains all memory ranges even if the
memory has been restricted, e.g. by use of the mem= kernel parameter.

Signed-off-by: Philipp Hachtmann
Signed-off-by: Martin Schwidefsky

Philipp Hachtmann
2014-05-20 14:58:39 +0800
f1af9d3af mm/memblock: Do some refactoring, enhance API ... Browse Code »

Refactor the memblock code and extend the memblock API to make it
more flexible. With the extended API it is simple to define and
work with additional memory lists.

The static functions memblock_add_region and __memblock_remove are
renamed to memblock_add_range and meblock_remove_range and added to
the memblock API.

The __next_free_mem_range and __next_free_mem_range_rev functions
are replaced with calls to the more generic list walkers
__next_mem_range and __next_mem_range_rev.

To walk an arbitrary memory list two new macros for_each_mem_range
and for_each_mem_range_rev are added. These new macros are used
to define for_each_free_mem_range and for_each_free_mem_range_reverse.

Signed-off-by: Philipp Hachtmann
Signed-off-by: Martin Schwidefsky

Philipp Hachtmann
2014-05-20 14:58:39 +0800

08 Apr, 2014

1 commit

167632303 mm/memblock.c: use PFN_PHYS() ... Browse Code »

Replace ((phys_addr_t)(x) << PAGE_SHIFT) by pfn macro.

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:35:58 +0800