08 Apr, 2020
1 commit
-
Patch series "mm: drop superfluous section checks when onlining/offlining".
Let's drop some superfluous section checks on the onlining/offlining path.
This patch (of 3):
Since commit c5e79ef561b0 ("mm/memory_hotplug.c: don't allow to
online/offline memory blocks with holes") we have a generic check in
offline_pages() that disallows offlining memory blocks with holes.

Memory blocks with missing sections are just another variant of this type
of block. We can stop checking (and especially storing) present
sections. A proper error message is now printed why offlining failed.

section_count was initially introduced in commit 07681215975e ("Driver
core: Add section count to memory_block struct") in order to detect when
it is okay to remove a memory block. It was used in commit 26bbe7ef6d5c
("drivers/base/memory.c: prohibit offlining of memory blocks with missing
sections") to disallow offlining memory blocks with missing sections. As
we refactored creation/removal of memory devices and have a proper check
for holes in place, we can drop the section_count.

This also removes a leftover comment regarding the mem_sysfs_mutex, which
was removed in commit 848e19ad3c33 ("drivers/base/memory.c: drop the
mem_sysfs_mutex").

Signed-off-by: David Hildenbrand
Signed-off-by: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Michal Hocko
Cc: Dan Williams
Cc: Pavel Tatashin
Cc: Anshuman Khandual
Link: http://lkml.kernel.org/r/20200127110424.5757-2-david@redhat.com
Signed-off-by: Linus Torvalds
01 Feb, 2020
2 commits
-
memory_block structure elements 'hw' and 'phys_callback' are not getting
used. They were originally added with commit 3947be1969a9 ("[PATCH]
memory hotplug: sysfs and add/remove functions") but never seem to have
been used. Just drop them now.

Link: http://lkml.kernel.org/r/1576728650-13867-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual
Reviewed-by: Dan Williams
Reviewed-by: David Hildenbrand
Cc: Michal Hocko
Cc: Pavel Tatashin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Luckily, we have no users left, so we can get rid of it. Clean up
set_migratetype_isolate() a little bit.

Link: http://lkml.kernel.org/r/20191114131911.11783-2-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Greg Kroah-Hartman
Acked-by: Michal Hocko
Cc: "Rafael J. Wysocki"
Cc: Pavel Tatashin
Cc: Dan Williams
Cc: Oscar Salvador
Cc: Qian Cai
Cc: Anshuman Khandual
Cc: Pingfan Liu
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Nov, 2019
1 commit
-
try_offline_node() is pretty much broken right now:
- The node span is updated when onlining memory, not when adding it. We
  ignore memory that was never onlined. Bad.

- We touch possible garbage memmaps. The pfn_to_nid(pfn) can easily
  trigger a kernel panic. Bad for memory that is offline but also bad
  for subsection hotadd with ZONE_DEVICE, whereby the memmap of the
  first PFN of a section might contain garbage.

- Sections belonging to mixed nodes are not properly considered.

As memory blocks might belong to multiple nodes, we would have to walk
all pageblocks (or at least subsections) within present sections.
However, we don't have a way to identify whether a memmap that is not
online was initialized (relevant for ZONE_DEVICE). This makes things
more complicated.

Luckily, we can piggyback on the node span and the nid stored in memory
blocks. Currently, the node span is grown when calling
move_pfn_range_to_zone() - e.g., when onlining memory, and shrunk when
removing memory, before calling try_offline_node(). Sysfs links are
created via link_mem_sections(), e.g., during boot or when adding
memory.

If the node still spans memory or if any memory block belongs to the
nid, we don't set the node offline. As memory blocks that span multiple
nodes cannot get offlined, the nid stored in memory blocks is reliable
enough (for such online memory blocks, the node still spans the memory).

Introduce for_each_memory_block() to efficiently walk all memory blocks.
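
The offlining rule described above can be sketched in plain user-space C. This is only a model under stated assumptions: struct mock_memory_block, mock_for_each_memory_block(), mock_block_belongs_to_nid() and mock_node_may_go_offline() are hypothetical stand-ins rather than the kernel's definitions, and the convention that a non-zero callback result stops the walk is assumed for illustration.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a memory block: only the nid matters here. */
struct mock_memory_block {
	int nid;
};

/* Callback type for the walker; a non-zero return value stops the walk. */
typedef int (*mock_walk_func_t)(struct mock_memory_block *mem, void *arg);

/* Walk all blocks and return the first non-zero callback result, else 0. */
static int mock_for_each_memory_block(struct mock_memory_block *blocks,
				      size_t count, void *arg,
				      mock_walk_func_t func)
{
	size_t i;
	int ret;

	for (i = 0; i < count; i++) {
		ret = func(&blocks[i], arg);
		if (ret)
			return ret;
	}
	return 0;
}

/* Report a hit (non-zero) as soon as a block belongs to the given nid. */
static int mock_block_belongs_to_nid(struct mock_memory_block *mem, void *arg)
{
	int nid = *(int *)arg;

	return mem->nid == nid;
}

/*
 * A node may go offline only if it no longer spans any pages and no
 * memory block still carries its nid.
 */
static bool mock_node_may_go_offline(unsigned long node_spanned_pages,
				     struct mock_memory_block *blocks,
				     size_t count, int nid)
{
	if (node_spanned_pages)
		return false;
	return !mock_for_each_memory_block(blocks, count, &nid,
					   mock_block_belongs_to_nid);
}
```

The point of the model is that neither check touches a memmap: both the node span and the per-block nid are plain bookkeeping values, so the decision stays safe even when the underlying struct pages were never initialized.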

Note: We will soon stop shrinking the ZONE_DEVICE zone and the node span
when removing ZONE_DEVICE memory to fix similar issues (access of
garbage memmaps) - until we have a reliable way to identify whether
these memmaps were properly initialized. This implies that later, once
a node has had ZONE_DEVICE memory, we won't be able to set that node
offline - which should be acceptable.

Since commit f1dd2cd13c4b ("mm, memory_hotplug: do not associate
hotadded memory to zones until online") memory that is added is not
associated with a zone/node (memmap not initialized). The introducing
commit 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
already missed that we could have multiple nodes for a section and that
the zone/node span is updated when onlining pages, not when adding them.

I tested this by hotplugging two DIMMs to a memory-less and cpu-less
NUMA node. The node is properly onlined when adding the DIMMs. When
removing the DIMMs, the node is properly offlined.

Masayoshi Mizuma reported:
: Without this patch, memory hotplug fails as panic:
:
: BUG: kernel NULL pointer dereference, address: 0000000000000000
: ...
: Call Trace:
: remove_memory_block_devices+0x81/0xc0
: try_remove_memory+0xb4/0x130
: __remove_memory+0xa/0x20
: acpi_memory_device_remove+0x84/0x100
: acpi_bus_trim+0x57/0x90
: acpi_bus_trim+0x2e/0x90
: acpi_device_hotplug+0x2b2/0x4d0
: acpi_hotplug_work_fn+0x1a/0x30
: process_one_work+0x171/0x380
: worker_thread+0x49/0x3f0
: kthread+0xf8/0x130
: ret_from_fork+0x35/0x40

[david@redhat.com: v3]
Link: http://lkml.kernel.org/r/20191102120221.7553-1-david@redhat.com
Link: http://lkml.kernel.org/r/20191028105458.28320-1-david@redhat.com
Fixes: 60a5a19e7419 ("memory-hotplug: remove sysfs file of node")
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") # visible after d0dc12e86b319
Signed-off-by: David Hildenbrand
Tested-by: Masayoshi Mizuma
Cc: Tang Chen
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Keith Busch
Cc: Jiri Olsa
Cc: "Peter Zijlstra (Intel)"
Cc: Jani Nikula
Cc: Nayna Jain
Cc: Michal Hocko
Cc: Oscar Salvador
Cc: Stephen Rothwell
Cc: Dan Williams
Cc: Pavel Tatashin
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Sep, 2019
2 commits
-
Each memory block spans the same amount of sections/pages/bytes. The size
is determined before the first memory block is created. No need to store
what we can easily calculate - and the calculations even look simpler now.

Michal brought up the idea of variable-sized memory blocks. However, if
we ever implement something like this, we will need an API compatibility
switch and reworks at various places (most code assumes a fixed memory
block size). So let's clean up what we have right now.

While at it, fix the variable naming in register_mem_sect_under_node() -
we no longer talk about a single section.

Link: http://lkml.kernel.org/r/20190809110200.2746-1-david@redhat.com
Signed-off-by: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Dan Williams
Cc: Oscar Salvador
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Let's validate the memory block size early, when initializing the memory
device infrastructure. Fail hard in case the value is not suitable.

As nobody checks the return value of memory_dev_init(), turn it into a
void function and fail with a panic in all scenarios instead. Otherwise,
we'll crash later during boot when core/drivers expect that the memory
device infrastructure (including memory_block_size_bytes()) works as
expected.

I think long term, we should move the whole memory block size
configuration (set_memory_block_size_order() and
memory_block_size_bytes()) into drivers/base/memory.c.

Link: http://lkml.kernel.org/r/20190806090142.22709-1-david@redhat.com
Signed-off-by: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Pavel Tatashin
Cc: Michal Hocko
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Jul, 2019
5 commits
-
No longer needed, let's remove it. Also, drop the "hint" parameter
completely from "find_memory_block_by_id", as nobody needs it anymore.

[david@redhat.com: v3]
Link: http://lkml.kernel.org/r/20190620183139.4352-7-david@redhat.com
[david@redhat.com: handle zero-length walks]
Link: http://lkml.kernel.org/r/1c2edc22-afd7-2211-c4c7-40e54e5007e8@redhat.com
Link: http://lkml.kernel.org/r/20190614100114.311-7-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Andrew Morton
Tested-by: Qian Cai
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: David Hildenbrand
Cc: Stephen Rothwell
Cc: Pavel Tatashin
Cc: Andrew Banman
Cc: Mike Travis
Cc: Oscar Salvador
Cc: Michal Hocko
Cc: Wei Yang
Cc: Arun KS
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Let's move walk_memory_blocks() to the place where memory block logic
resides and simplify it. While at it, add a type for the callback
function.

Link: http://lkml.kernel.org/r/20190614100114.311-6-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: David Hildenbrand
Cc: Stephen Rothwell
Cc: Pavel Tatashin
Cc: Andrew Banman
Cc: Mike Travis
Cc: Oscar Salvador
Cc: Michal Hocko
Cc: Wei Yang
Cc: Arun KS
Cc: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Let's factor out removing of memory block devices, which is only
necessary for memory added via add_memory() and friends that created
memory block devices. Remove the devices before calling
arch_remove_memory().

This finishes factoring out memory block device handling from
arch_add_memory() and arch_remove_memory().

Link: http://lkml.kernel.org/r/20190527111152.16324-10-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Dan Williams
Acked-by: Michal Hocko
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: David Hildenbrand
Cc: "mike.travis@hpe.com"
Cc: Andrew Banman
Cc: Ingo Molnar
Cc: Alex Deucher
Cc: "David S. Miller"
Cc: Mark Brown
Cc: Chris Wilson
Cc: Oscar Salvador
Cc: Jonathan Cameron
Cc: Arun KS
Cc: Mathieu Malaterre
Cc: Andy Lutomirski
Cc: Anshuman Khandual
Cc: Ard Biesheuvel
Cc: Baoquan He
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Catalin Marinas
Cc: Chintan Pandya
Cc: Christophe Leroy
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: Heiko Carstens
Cc: "H. Peter Anvin"
Cc: Joonsoo Kim
Cc: Jun Yao
Cc: "Kirill A. Shutemov"
Cc: Logan Gunthorpe
Cc: Mark Rutland
Cc: Masahiro Yamada
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Nicholas Piggin
Cc: Oscar Salvador
Cc: Paul Mackerras
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: Qian Cai
Cc: Rich Felker
Cc: Rob Herring
Cc: Robin Murphy
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vasily Gorbik
Cc: Wei Yang
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Yu Zhao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Only memory to be added to the buddy and to be onlined/offlined by user
space using /sys/devices/system/memory/... needs (and should have!)
memory block devices.

Factor out creation of memory block devices. Create all devices after
arch_add_memory() succeeded. We can later drop the want_memblock
parameter, because it is now effectively stale.

Only after memory block devices have been added, memory can be onlined
by user space. This implies that memory is not visible to user space
at all before arch_add_memory() succeeded.

While at it
- use WARN_ON_ONCE instead of BUG_ON in moved unregister_memory()
- introduce find_memory_block_by_id() to search via block id
- use find_memory_block_by_id() in init_memory_block() to catch
  duplicates

Link: http://lkml.kernel.org/r/20190527111152.16324-8-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Pavel Tatashin
Acked-by: Michal Hocko
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: David Hildenbrand
Cc: "mike.travis@hpe.com"
Cc: Ingo Molnar
Cc: Andrew Banman
Cc: Oscar Salvador
Cc: Qian Cai
Cc: Wei Yang
Cc: Arun KS
Cc: Mathieu Malaterre
Cc: Alex Deucher
Cc: Andy Lutomirski
Cc: Anshuman Khandual
Cc: Ard Biesheuvel
Cc: Baoquan He
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Catalin Marinas
Cc: Chintan Pandya
Cc: Christophe Leroy
Cc: Chris Wilson
Cc: Dan Williams
Cc: Dave Hansen
Cc: "David S. Miller"
Cc: Fenghua Yu
Cc: Heiko Carstens
Cc: "H. Peter Anvin"
Cc: Jonathan Cameron
Cc: Joonsoo Kim
Cc: Jun Yao
Cc: "Kirill A. Shutemov"
Cc: Logan Gunthorpe
Cc: Mark Brown
Cc: Mark Rutland
Cc: Masahiro Yamada
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Nicholas Piggin
Cc: Oscar Salvador
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Rich Felker
Cc: Rob Herring
Cc: Robin Murphy
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vasily Gorbik
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Yu Zhao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
We want to improve error handling while adding memory by allowing to use
arch_remove_memory() and __remove_pages() even if
CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like:

	arch_add_memory()
	rc = do_something();
	if (rc) {
		arch_remove_memory();
	}

We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
quite some dependencies for memory offlining.

Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
Signed-off-by: David Hildenbrand
Reviewed-by: Pavel Tatashin
Cc: Tony Luck
Cc: Fenghua Yu
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Heiko Carstens
Cc: Yoshinori Sato
Cc: Rich Felker
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Michal Hocko
Cc: David Hildenbrand
Cc: Oscar Salvador
Cc: "Kirill A. Shutemov"
Cc: Alex Deucher
Cc: "David S. Miller"
Cc: Mark Brown
Cc: Chris Wilson
Cc: Christophe Leroy
Cc: Nicholas Piggin
Cc: Vasily Gorbik
Cc: Rob Herring
Cc: Masahiro Yamada
Cc: "mike.travis@hpe.com"
Cc: Andrew Banman
Cc: Arun KS
Cc: Qian Cai
Cc: Mathieu Malaterre
Cc: Baoquan He
Cc: Logan Gunthorpe
Cc: Anshuman Khandual
Cc: Ard Biesheuvel
Cc: Catalin Marinas
Cc: Chintan Pandya
Cc: Dan Williams
Cc: Ingo Molnar
Cc: Jonathan Cameron
Cc: Joonsoo Kim
Cc: Jun Yao
Cc: Mark Rutland
Cc: Mike Rapoport
Cc: Oscar Salvador
Cc: Robin Murphy
Cc: Wei Yang
Cc: Will Deacon
Cc: Yu Zhao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 May, 2019
1 commit
-
Failing while removing memory is mostly ignored and cannot really be
handled. Let's treat errors in unregister_memory_section() in a nice way,
warning, but continuing.

Link: http://lkml.kernel.org/r/20190409100148.24703-3-david@redhat.com
Signed-off-by: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Ingo Molnar
Cc: Andrew Banman
Cc: Mike Travis
Cc: David Hildenbrand
Cc: Oscar Salvador
Cc: Michal Hocko
Cc: Pavel Tatashin
Cc: Qian Cai
Cc: Wei Yang
Cc: Arun KS
Cc: Mathieu Malaterre
Cc: Andy Lutomirski
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christophe Leroy
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: Geert Uytterhoeven
Cc: Heiko Carstens
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Joonsoo Kim
Cc: "Kirill A. Shutemov"
Cc: Martin Schwidefsky
Cc: Masahiro Yamada
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Nicholas Piggin
Cc: Oscar Salvador
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Rich Felker
Cc: Rob Herring
Cc: Stefan Agner
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vasily Gorbik
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Jun, 2018
1 commit
-
Add a new function to "adjust" the current fixed UV memory block size
of 2GB so it can be changed to a different physical boundary. This is
out of necessity so arch dependent code can accommodate specific BIOS
requirements which can align these new PMEM modules at less than the
default boundaries.

A "set order" type of function was used to ensure that the memory block
size will be a power of two value without requiring a validity check.
64GB was chosen as the upper limit for memory block size values to
accommodate upcoming 4PB systems which have 6 more bits of physical
address space (46 becoming 52).

Signed-off-by: Mike Travis
Reviewed-by: Andrew Banman
Cc: Andrew Morton
Cc: Dimitri Sivanich
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Russ Anderson
Cc: Thomas Gleixner
Cc: dan.j.williams@intel.com
Cc: jgross@suse.com
Cc: kirill.shutemov@linux.intel.com
Cc: mhocko@suse.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/lkml/20180524201711.609546602@stormcage.americas.sgi.com
Signed-off-by: Ingo Molnar
06 Apr, 2018
2 commits
-
During memory hotplugging we traverse struct pages three times:

1. memset(0) in sparse_add_one_section()
2. loop in __add_section() to do: set_page_node(page, nid); and
   SetPageReserved(page);
3. loop in memmap_init_zone() to call __init_single_pfn()

This patch removes the first two loops, and leaves only loop 3. All
struct pages are initialized in one place, the same as it is done during
boot.

The benefits:

- We improve memory hotplug performance because we are not evicting the
  cache several times and also reduce loop branching overhead.

- Remove condition from hotpath in __init_single_pfn(), that was added
  in order to fix the problem that was reported by Bharata in the above
  email thread, thus also improve performance during normal boot.

- Make memory hotplug more similar to the boot memory initialization
  path because we zero and initialize struct pages only in one
  function.

- Simplifies memory hotplug struct page initialization code, and thus
  enables future improvements, such as multi-threading the
  initialization of struct pages in order to improve hotplug
  performance even further on larger machines.

[pasha.tatashin@oracle.com: v5]
Link: http://lkml.kernel.org/r/20180228030308.1116-7-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-7-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin
Reviewed-by: Ingo Molnar
Cc: Michal Hocko
Cc: Baoquan He
Cc: Bharata B Rao
Cc: Daniel Jordan
Cc: Dan Williams
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Steven Sistare
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
During memory hotplugging the probe routine will leave struct pages
uninitialized, the same as it is currently done during boot. Therefore,
we do not want to access the inside of struct pages before
__init_single_page() is called during onlining.

Because during hotplug we know that pages in one memory block belong to
the same numa node, we can skip the checking. We should keep checking
for the boot case.

[pasha.tatashin@oracle.com: s/register_new_memory()/hotplug_memory_register()]
Link: http://lkml.kernel.org/r/20180228030308.1116-6-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-6-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin
Acked-by: Michal Hocko
Reviewed-by: Ingo Molnar
Cc: Baoquan He
Cc: Bharata B Rao
Cc: Daniel Jordan
Cc: Dan Williams
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
Cc: Mel Gorman
Cc: Steven Sistare
Cc: Thomas Gleixner
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
25 Feb, 2017
1 commit
-
Commit 31bc3858ea3e ("add automatic onlining policy for the newly added
memory") provides the capability to have added memory automatically
onlined during add, but this appears to be slightly broken.

The current implementation uses walk_memory_range() to call
online_memory_block(), which uses memory_block_change_state() to online
the memory. Instead, we should be calling device_online() for the
memory block in online_memory_block(). This would online the memory
(the memory bus online routine memory_subsys_online() called from
device_online() calls memory_block_change_state()) and properly update the
device struct offline flag.

As a result of the current implementation, attempting to remove a memory
block after adding it using auto online fails. This is because doing a
remove, for instance

	echo offline > /sys/devices/system/memory/memoryXXX/state

uses device_offline() which checks the dev->offline flag.
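
The failure mode described above can be modeled in a few lines of user-space C. Everything here is a hypothetical simplification: struct mock_block_dev and the mock_*() helpers are not kernel APIs, and the return convention of mock_device_offline() (1 when the flag already says "offline") is an assumption made for the illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified models; not the kernel's struct device or memory_block. */
enum mock_mem_state { MOCK_MEM_OFFLINE, MOCK_MEM_ONLINE };

struct mock_block_dev {
	enum mock_mem_state state;	/* state of the underlying memory */
	bool dev_offline;		/* models the dev->offline flag */
};

/* Old auto-online path: changes the memory state, leaves the flag stale. */
static void mock_online_direct(struct mock_block_dev *mem)
{
	mem->state = MOCK_MEM_ONLINE;
}

/* Fixed path: go through the device layer, which also updates the flag. */
static void mock_device_online(struct mock_block_dev *mem)
{
	if (!mem->dev_offline)
		return;
	mem->state = MOCK_MEM_ONLINE;
	mem->dev_offline = false;
}

/*
 * Models an offline request: returns 1 without doing anything if the
 * flag already says "offline", 0 after actually offlining the memory.
 */
static int mock_device_offline(struct mock_block_dev *mem)
{
	if (mem->dev_offline)
		return 1;
	mem->state = MOCK_MEM_OFFLINE;
	mem->dev_offline = true;
	return 0;
}
```

After mock_online_direct() the memory is online but the flag still reads "offline", so a later offline request is skipped and the memory stays online. Going through mock_device_online() keeps the two pieces of state in sync.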
Link: http://lkml.kernel.org/r/20170222220744.8119.19687.stgit@ltcalpine2-lp14.aus.stglabs.ibm.com
Signed-off-by: Nathan Fontenot
Cc: Michael Ellerman
Cc: Michael Roth
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Mar, 2016
1 commit
-
Pull char/misc updates from Greg KH:
"Here is the big char/misc driver update for 4.6-rc1.

The majority of the patches here is hwtracing and some new mic
drivers, but there's a lot of other driver updates as well. Full
details in the shortlog.

All have been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits)
goldfish: Fix build error of missing ioremap on UM
nvmem: mediatek: Fix later provider initialization
nvmem: imx-ocotp: Fix return value of imx_ocotp_read
nvmem: Fix dependencies for !HAS_IOMEM archs
char: genrtc: replace blacklist with whitelist
drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular
drivers: char: mem: fix IS_ERROR_VALUE usage
char: xillybus: Fix internal data structure initialization
pch_phub: return -ENODATA if ROM can't be mapped
Drivers: hv: vmbus: Support kexec on ws2012 r2 and above
Drivers: hv: vmbus: Support handling messages on multiple CPUs
Drivers: hv: utils: Remove util transport handler from list if registration fails
Drivers: hv: util: Pass the channel information during the init call
Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()
Drivers: hv: vmbus: remove code duplication in message handling
Drivers: hv: vmbus: avoid wait_for_completion() on crash
Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages
misc: at24: replace memory_accessor with nvmem_device_read
eeprom: 93xx46: extend driver to plug into the NVMEM framework
eeprom: at25: extend driver to plug into the NVMEM framework
...
16 Mar, 2016
1 commit
-
Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them. Some Linux distributions carry special udev
rules like:

	SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically. This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added. The default is "offline".

Signed-off-by: Vitaly Kuznetsov
Reviewed-by: Daniel Kiper
Cc: Jonathan Corbet
Cc: Greg Kroah-Hartman
Cc: Daniel Kiper
Cc: Dan Williams
Cc: Tang Chen
Cc: David Vrabel
Acked-by: David Rientjes
Cc: Naoya Horiguchi
Cc: Xishi Qiu
Cc: Mel Gorman
Cc: "K. Y. Srinivasan"
Cc: Igor Mammedov
Cc: Kay Sievers
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Mar, 2016
1 commit
-
Now that the AT24 uses the NVMEM framework, replace the
memory_accessor in the setup() callback with nvmem API calls.

Signed-off-by: Andrew Lunn
Acked-by: Srinivas Kandagatla
Tested-by: Sekhar Nori
Acked-by: Wolfram Sang
Signed-off-by: Greg Kroah-Hartman
23 Oct, 2014
1 commit
-
drivers/base/memory.c provides a default memory_block_size_bytes()
definition explicitly marked "weak". Several architectures provide their
own definitions intended to override the default, but the "weak" attribute
on the declaration applied to the arch definitions as well, so the linker
chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak
annotation from pcibios_get_phb_of_node decl")).Remove the "weak" attribute from the declaration so we always prefer a
non-weak definition over the weak one, independent of link order.Fixes: 41f107266b19 ("drivers: base: Add prototype declaration to the header file")
Signed-off-by: Bjorn Helgaas
Acked-by: Andrew Morton
CC: Rashika Kheria
CC: Nathan Fontenot
CC: Anton Blanchard
CC: Heiko Carstens
CC: Yinghai Lu
21 Dec, 2013
1 commit
-
Add prototype declaration of function memory_block_size_bytes() to
the header file include/linux/memory.h.

This eliminates the following warning in memory.c:
drivers/base/memory.c:87:1: warning: no previous prototype for ‘memory_block_size_bytes’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria
Signed-off-by: Greg Kroah-Hartman
22 Aug, 2013
2 commits
-
There are two ways to set the online/offline state for a memory block:
echo 0|1 > online and echo online|online_kernel|online_movable|offline >
state.

The state attribute can online a memory block with extra data, the
"online type", where the online attribute uses a default online type of
ONLINE_KEEP, same as echo online > state.

Currently there is a state_mutex that provides consistency between the
memory block state and the underlying memory.

The problem is that this code does a lot of things that the common
device layer can do for us, such as the serialization of the
online/offline handlers using the device lock, setting the dev->offline
field, and calling kobject_uevent().

This patch refactors the online/offline code to allow the common
device_[online|offline] functions to be used. The result is a simpler
and more common code path for the two state setting mechanisms. It also
removes the state_mutex from the struct memory_block as the memory block
device lock provides the state consistency.

No functional change is intended by this patch.
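
How the two sysfs attributes described above could map onto one set of online types is sketched below. The enum and both parser functions are hypothetical and exist only for this illustration; they use exact string matches and do not handle the trailing newline that a real sysfs write usually carries.

```c
#include <string.h>

/* Hypothetical online types, loosely following the names in the text above. */
enum mock_online_type {
	MOCK_OFFLINE,
	MOCK_ONLINE_KEEP,
	MOCK_ONLINE_KERNEL,
	MOCK_ONLINE_MOVABLE,
	MOCK_INVALID,
};

/* Map a string written to the "state" attribute to an online type. */
static enum mock_online_type mock_parse_state(const char *buf)
{
	if (!strcmp(buf, "online_kernel"))
		return MOCK_ONLINE_KERNEL;
	if (!strcmp(buf, "online_movable"))
		return MOCK_ONLINE_MOVABLE;
	if (!strcmp(buf, "online"))
		return MOCK_ONLINE_KEEP;
	if (!strcmp(buf, "offline"))
		return MOCK_OFFLINE;
	return MOCK_INVALID;
}

/* Map a value written to the "online" attribute; "1" uses the default type. */
static enum mock_online_type mock_parse_online(const char *buf)
{
	if (!strcmp(buf, "1"))
		return MOCK_ONLINE_KEEP;
	if (!strcmp(buf, "0"))
		return MOCK_OFFLINE;
	return MOCK_INVALID;
}
```

In this model "echo 1 > online" and "echo online > state" resolve to the same type, which is why a single code path can serve both attributes.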
Signed-off-by: Seth Jennings
Signed-off-by: Greg Kroah-Hartman
-
Now that add_memory_section() is only called from boot time, reduce
the logic and remove the enum.

Signed-off-by: Seth Jennings
Signed-off-by: Greg Kroah-Hartman
01 May, 2013
1 commit
-
Fix the following compilation warnings:

mm/slab.c: In function `kmem_cache_init_late':
mm/slab.c:1778:2: warning: statement with no effect [-Wunused-value]

mm/page_cgroup.c: In function `page_cgroup_init':
mm/page_cgroup.c:305:2: warning: statement with no effect [-Wunused-value]

Signed-off-by: Vincent Stehlé
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Apr, 2013
2 commits
-
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC
pseries will return -EOPNOTSUPP if unsupported.

Adding an #ifdef causes several other functions it depends on to also
become unnecessary, which saves in .text when disabled (it's disabled in
most defconfigs besides powerpc, including x86). remove_memory_block()
becomes static since it is not referenced outside of
drivers/base/memory.c.

Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
and disabled.

Signed-off-by: David Rientjes
Acked-by: Toshi Kani
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Greg Kroah-Hartman
Cc: Wen Congyang
Cc: Tang Chen
Cc: Yasuaki Ishimatsu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
When CONFIG_MEMORY_HOTPLUG=n, we don't want the memory-hotplug notifier
handlers to be included in the .o files, for space reasons.

The existing hotplug_memory_notifier() tries to handle this but testing
with gcc-4.4.4 shows that it doesn't work - the hotplug functions are
still present in the .o files.

So implement a new register_hotmemory_notifier() which is a copy of
register_hotcpu_notifier(), and which actually works as desired.
hotplug_memory_notifier() and register_memory_notifier() callsites
should be converted to use this new register_hotmemory_notifier().

While we're there, let's repair the existing hotplug_memory_notifier():
it simply stomps on the register_memory_notifier() return value, so
well-behaved code cannot check for errors. Apparently none of the
existing callers were well-behaved :(

Cc: Andrew Shewmaker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Dec, 2012
1 commit
-
Update nodemasks management for N_MEMORY.
[lliubbo@gmail.com: fix build]
Signed-off-by: Lai Jiangshan
Signed-off-by: Wen Congyang
Cc: Christoph Lameter
Cc: Hillf Danton
Cc: Lin Feng
Cc: David Rientjes
Signed-off-by: Bob Liu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Dec, 2012
1 commit
-
Currently memory_hotplug only manages node_states[N_HIGH_MEMORY]; it
forgets to manage node_states[N_NORMAL_MEMORY]. This may cause
node_states[N_NORMAL_MEMORY] to become incorrect.

For example, if a node is empty before onlining and we online memory that
is in ZONE_NORMAL, then after onlining node_states[N_HIGH_MEMORY] is
correct but node_states[N_NORMAL_MEMORY] is incorrect: the online code
doesn't set the newly onlined node in node_states[N_NORMAL_MEMORY].

The same thing will happen when offlining (the offline code doesn't clear
the node from node_states[N_NORMAL_MEMORY] when needed). Some memory
management code depends on node_states[N_NORMAL_MEMORY], so we have to fix
up node_states[N_NORMAL_MEMORY].

We add node_states_check_changes_online() and
node_states_check_changes_offline() to detect whether
node_states[N_HIGH_MEMORY] and node_states[N_NORMAL_MEMORY] are changed
while hotplugging.

Also add @status_change_nid_normal to struct memory_notify, thus the
memory hotplug callbacks know whether the node_states[N_NORMAL_MEMORY] are
changed. (We can add a @flags and reuse @status_change_nid instead of
introducing @status_change_nid_normal, but it will add much more
complexity in memory hotplug callback in every subsystem. So introducing
@status_change_nid_normal is better and it doesn't change the semantics of
@status_change_nid)

Signed-off-by: Lai Jiangshan
Cc: David Rientjes
Cc: Minchan Kim
Cc: KOSAKI Motohiro
Cc: Yasuaki Ishimatsu
Cc: Rob Landley
Cc: Jiang Liu
Cc: Kay Sievers
Cc: Greg Kroah-Hartman
Cc: Mel Gorman
Cc: Wen Congyang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
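
The online-side check described in the commit above can be modelled in a few lines of userspace C. This is only a simplified sketch: the `_model` types, the zone enum, and the two boolean inputs are hypothetical stand-ins for kernel state, not the kernel's actual definitions. It shows the idea that a node id is reported in `status_change_nid_normal` only when the onlined zone is at or below the normal zone and the node had no normal memory before, and `-1` otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced zone ordering: lower zones come first. */
enum zone_type_model { ZONE_DMA_MODEL, ZONE_NORMAL_MODEL, ZONE_HIGHMEM_MODEL };

/* Hypothetical stand-in for the two fields discussed in the commit. */
struct memory_notify_model {
	int status_change_nid;        /* node gains its first memory, else -1 */
	int status_change_nid_normal; /* node gains its first normal memory, else -1 */
};

/*
 * Decide which node states would change when memory in 'zone' of node
 * 'nid' is onlined. The two booleans describe the node before onlining.
 */
static void check_changes_online_model(int nid, enum zone_type_model zone,
				       bool node_has_normal_memory,
				       bool node_has_any_memory,
				       struct memory_notify_model *arg)
{
	assert(arg);

	if (zone <= ZONE_NORMAL_MODEL && !node_has_normal_memory)
		arg->status_change_nid_normal = nid;
	else
		arg->status_change_nid_normal = -1;

	arg->status_change_nid = node_has_any_memory ? -1 : nid;
}
```

A callback that only cares about normal memory can then test `status_change_nid_normal >= 0` and ignore the other field, which is the simplification the commit message argues for.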
18 Sep, 2012
1 commit
-
I found the following definition in include/linux/memory.h. On my IA64
platform, SECTION_SIZE_BITS is equal to 32, and MIN_MEMORY_BLOCK_SIZE
will be 0.

#define MIN_MEMORY_BLOCK_SIZE (1 << SECTION_SIZE_BITS)

Because MIN_MEMORY_BLOCK_SIZE is of type int, which is 32 bits long,
MIN_MEMORY_BLOCK_SIZE (1 << 32) will be equal to 0.
Actually, whenever SECTION_SIZE_BITS >= 31, MIN_MEMORY_BLOCK_SIZE will be wrong.
This will cause wrong system memory information in sysfs.
I think it should be:

#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)

And "echo offline > memory0/state" will cause the following call trace:
kernel BUG at mm/memory_hotplug.c:885!
sh[6455]: bugcheck! 0 [1]
Pid: 6455, CPU 0, comm: sh
psr : 0000101008526030 ifs : 8000000000000fa4 ip : [] Not tainted (3.6.0-rc1)
ip is at offline_pages+0x210/0xee0
Call Trace:
show_stack+0x80/0xa0
show_regs+0x640/0x920
die+0x190/0x2c0
die_if_kernel+0x50/0x80
ia64_bad_break+0x3d0/0x6e0
ia64_native_leave_kernel+0x0/0x270
offline_pages+0x210/0xee0
alloc_pages_current+0x180/0x2a0

Signed-off-by: Jianguo Wu
Signed-off-by: Jiang Liu
Cc: "Luck, Tony"
Reviewed-by: Michal Hocko
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
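
The width problem behind the MIN_MEMORY_BLOCK_SIZE fix above can be illustrated in portable C. Shifting a plain int by 32 is undefined behaviour in C, so this sketch does not execute the buggy expression directly. Instead it performs the shift in a 64-bit unsigned type and then models what a 32-bit result can hold by truncating. The helper names are hypothetical, and `uint64_t` stands in for the kernel's `1UL` on a 64-bit platform.

```c
#include <assert.h>
#include <stdint.h>

/* Models the fixed macro: the shift happens in a 64-bit unsigned type. */
static uint64_t block_size_wide(unsigned int section_size_bits)
{
	assert(section_size_bits < 64);
	return UINT64_C(1) << section_size_bits;
}

/*
 * Models a result confined to 32 bits: only the low 32 bits of the
 * wide value survive, so any shift count of 32 or more yields 0.
 */
static uint32_t block_size_32bit(unsigned int section_size_bits)
{
	return (uint32_t)block_size_wide(section_size_bits);
}
```

With a shift count of 32, `block_size_wide()` gives 4 GiB while `block_size_32bit()` gives 0, matching the value reported in the commit message.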
22 Dec, 2011
1 commit
-
This moves the 'memory sysdev_class' over to a regular 'memory' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Signed-off-by: Kay Sievers
Signed-off-by: Greg Kroah-Hartman
12 Jul, 2011
1 commit
-
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header.

Signed-off-by: Benjamin Herrenschmidt
Acked-by: Ingo Molnar
04 Feb, 2011
1 commit
-
Update the 'phys_index' property of the memory_block struct to be
called start_section_nr, and add an end_section_nr property. The
data tracked here is the same but the updated naming is more in line
with what is stored here, namely the first and last section number
that the memory block spans.

The names presented to userspace remain the same, phys_index for
start_section_nr and end_phys_index for end_section_nr, to avoid breaking
anything in userspace.

This also updates the node sysfs code to be aware of the new capability for
a memory block to contain multiple memory sections and be aware of the memory
block structure name changes (start_section_nr). This requires an additional
parameter to unregister_mem_sect_under_nodes so that we know which memory
section of the memory block to unregister.

Signed-off-by: Nathan Fontenot
Reviewed-by: Robin Holt
Reviewed-by: KAMEZAWA Hiroyuki
Signed-off-by: Greg Kroah-Hartman
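
As a rough illustration of the start_section_nr/end_section_nr naming in the commit above, a block that spans several sections can be described by its first and last section number. The struct and helpers below are a hypothetical userspace model, and `sections_per_block` is an assumed parameter rather than something taken from the commit text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model holding only the two renamed fields. */
struct memory_block_range_model {
	unsigned long start_section_nr; /* first section the block spans */
	unsigned long end_section_nr;   /* last section the block spans */
};

/* Fill in the inclusive section range covered by one memory block. */
static void memory_block_set_range(struct memory_block_range_model *mem,
				   unsigned long start_section_nr,
				   unsigned long sections_per_block)
{
	assert(mem);
	assert(sections_per_block > 0);

	mem->start_section_nr = start_section_nr;
	mem->end_section_nr = start_section_nr + sections_per_block - 1;
}

/* True if 'section_nr' falls inside the block's inclusive range. */
static bool memory_block_contains(const struct memory_block_range_model *mem,
				  unsigned long section_nr)
{
	return section_nr >= mem->start_section_nr &&
	       section_nr <= mem->end_section_nr;
}
```

A range check of this kind is one way code could tell which block a given section belongs to once blocks may hold more than one section.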
23 Oct, 2010
2 commits
-
Add a section count property to the memory_block struct to track the number
of memory sections that have been added/removed from a memory block. This
allows us to know when the last memory section of a memory block has been
removed so we can remove the memory block.

Signed-off-by: Nathan Fontenot
Reviewed-by: Robin Holt
Reviewed-by: KAMEZAWA Hiroyuki
Signed-off-by: Greg Kroah-Hartman
-
Introduce a find_memory_block_hinted() which utilizes the
recently added kset_find_obj_hinted().

Signed-off-by: Robin Holt
To: Dave Hansen
To: Matt Tolentino
Reviewed-by: KAMEZAWA Hiroyuki
Signed-off-by: Greg Kroah-Hartman
18 Mar, 2010
1 commit
-
/sys/devices/system/memory/memoryX/phys_device is supposed to contain the
number of the physical device that the corresponding piece of memory
belongs to.

In case a physical device should be replaced or taken offline for whatever
reason, it is necessary to set all corresponding memory pieces offline.
The current implementation always sets phys_device to '0' and there is no
way or hook to change that. It seems there was a plan to implement that,
but it wasn't finished for whatever reason.

So add a weak function which architectures can override to actually set
the phys_device from within add_memory_block().

Signed-off-by: Heiko Carstens
Cc: Dave Hansen
Cc: Gerald Schaefer
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
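
The weak-function hook described in the commit above can be sketched in userspace with the GCC/Clang `weak` attribute. The function name `arch_get_memory_phys_device` is what I believe the kernel uses for this hook, but the commit text does not name it, so treat it as an assumption. The struct and the `add_memory_block_model()` helper are hypothetical stand-ins.

```c
#include <assert.h>

/*
 * Generic default. Because the symbol is weak, a strong definition of the
 * same function in another object file replaces it at link time.
 */
__attribute__((weak)) int arch_get_memory_phys_device(unsigned long start_pfn)
{
	(void)start_pfn; /* the default ignores the address */
	return 0;
}

/* Hypothetical, reduced view of a memory block. */
struct phys_block_model {
	unsigned long start_pfn;
	int phys_device;
};

/* Model of block setup: ask the (possibly overridden) hook for the device. */
static void add_memory_block_model(struct phys_block_model *mem,
				   unsigned long start_pfn)
{
	assert(mem);
	mem->start_pfn = start_pfn;
	mem->phys_device = arch_get_memory_phys_device(start_pfn);
}
```

Without an override the block reports device 0, which matches the previous behaviour the commit message describes. An architecture would supply its own non-weak definition to return a real device number.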
18 Dec, 2009
1 commit
-
Memory balloon drivers can allocate a large amount of memory which is not
movable but could be freed to accommodate memory hotplug remove.

Prior to calling the memory hotplug notifier chain the memory in the
pageblock is isolated. Currently, if the migrate type is not
MIGRATE_MOVABLE the isolation will not proceed, causing the memory removal
for that page range to fail.

Rather than failing pageblock isolation if the migratetype is not
MIGRATE_MOVABLE, this patch checks if all of the pages in the pageblock,
and not on the LRU, are owned by a registered balloon driver (or other
entity) using a notifier chain. If all of the non-movable pages are owned
by a balloon, they can be freed later through the memory notifier chain
and the range can still be isolated in set_migratetype_isolate().

Signed-off-by: Robert Jennings
Cc: Mel Gorman
Cc: Ingo Molnar
Cc: Brian King
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Gerald Schaefer
Cc: KAMEZAWA Hiroyuki
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Benjamin Herrenschmidt
06 Apr, 2009
1 commit
-
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
tracing, net: fix net tree and tracing tree merge interaction
tracing, powerpc: fix powerpc tree and tracing tree interaction
ring-buffer: do not remove reader page from list on ring buffer free
function-graph: allow unregistering twice
trace: make argument 'mem' of trace_seq_putmem() const
tracing: add missing 'extern' keywords to trace_output.h
tracing: provide trace_seq_reserve()
blktrace: print out BLK_TN_MESSAGE properly
blktrace: extract duplidate code
blktrace: fix memory leak when freeing struct blk_io_trace
blktrace: fix blk_probes_ref chaos
blktrace: make classic output more classic
blktrace: fix off-by-one bug
blktrace: fix the original blktrace
blktrace: fix a race when creating blk_tree_root in debugfs
blktrace: fix timestamp in binary output
tracing, Text Edit Lock: cleanup
tracing: filter fix for TRACE_EVENT_FORMAT events
ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
x86: kretprobe-booster interrupt emulation code fix
...

Fix up trivial conflicts in
arch/parisc/include/asm/ftrace.h
include/linux/memory.h
kernel/extable.c
kernel/module.c
03 Apr, 2009
1 commit
-
Add an interface by which other kernel code can read/write persistent
memory such as I2C or SPI EEPROMs, or devices which provide NVRAM. Use
cases include storage of board-specific configuration data like Ethernet
addresses and sensor calibrations.

Original idea, review and improvement suggestions by David Brownell.
Acked-by: David Brownell
Signed-off-by: Kevin Hilman
Cc: Jean Delvare
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Mar, 2009
1 commit
-
This is an architecture-independent synchronization around kernel text
modifications through use of a global mutex.

A mutex has been chosen so that kprobes, the main user of this, can sleep
during memory allocation between the memory read of the instructions it
must replace and the memory write of the breakpoint.

Other users of this interface: immediate values.
Paravirt and alternatives are always done when SMP is inactive, so there
is no need to use locks.

Signed-off-by: Mathieu Desnoyers
LKML-Reference:
Signed-off-by: Ingo Molnar