14 Oct, 2020
6 commits
-
The 'struct resource' in 'struct dev_pagemap' is only used for holding
resource span information. The other fields, 'name', 'flags', 'desc',
'parent', 'sibling', and 'child' are all unused wasted space.

This is in preparation for introducing a multi-range extension of
devm_memremap_pages().

The bulk of this change is unwinding all the places internal to libnvdimm
that used 'struct resource' unnecessarily, and replacing instances of
'struct dev_pagemap'.res with 'struct dev_pagemap'.range.

P2PDMA had a minor usage of the resource flags field, but only to report
failures with "%pR". That is replaced with an open-coded print of the
range.

[dan.carpenter@oracle.com: mm/hmm/test: use after free in dmirror_allocate_chunk()]
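A minimal userspace sketch of the idea (an illustration, not the actual kernel diff; the helper names are assumptions): 'struct range' carries only a span, so span-only users stop paying for the unused 'struct resource' fields, and the "%pR" failure report can be open coded from start/end.

```c
#include <stdio.h>

/* Span-only descriptor, mirroring the kernel's struct range. */
struct range {
	unsigned long long start;
	unsigned long long end;	/* inclusive */
};

static unsigned long long range_len(const struct range *r)
{
	return r->end - r->start + 1;
}

/* Hypothetical open-coded replacement for a "%pR"-style failure report. */
static void report_range_failure(const struct range *r, char *buf, size_t len)
{
	snprintf(buf, len, "no bus window for range [mem %#018llx-%#018llx]",
		 r->start, r->end);
}
```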
Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam

Signed-off-by: Dan Williams
Signed-off-by: Dan Carpenter
Signed-off-by: Andrew Morton
Reviewed-by: Boris Ostrovsky [xen]
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Vishal Verma
Cc: Vivek Goyal
Cc: Dave Jiang
Cc: Ben Skeggs
Cc: David Airlie
Cc: Daniel Vetter
Cc: Ira Weiny
Cc: Bjorn Helgaas
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: "Jérôme Glisse"
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Borislav Petkov
Cc: Brice Goglin
Cc: Catalin Marinas
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Hulk Robot
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jason Yan
Cc: Jeff Moyer
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: kernel test robot
Cc: Mike Rapoport
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Wei Yang
Cc: Will Deacon
Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
-
In preparation for introducing seed devices the dax-bus core needs to be
able to intercept ->probe() and ->remove() operations. Towards that end
arrange for the bus and drivers to switch from raw 'struct device' driver
operations to 'struct dev_dax' typed operations.

Reported-by: Hulk Robot
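As a hedged illustration (simplified stand-in types, not the kernel source), the bus core can own the raw-to-typed translation and hand drivers a typed 'struct dev_dax', giving it a single place to intercept probe:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
struct dev_dax {
	int id;
	int probed;
};

/* Drivers implement dev_dax-typed operations... */
struct dax_device_driver {
	int (*probe)(struct dev_dax *dev_dax);
	int (*remove)(struct dev_dax *dev_dax);
};

/* ...and the bus core dispatches, so it can intercept ->probe()
 * (e.g. for the seed devices mentioned above). */
static int dax_bus_probe(struct dax_device_driver *drv, struct dev_dax *dev_dax)
{
	dev_dax->probed = 1;	/* interception point */
	return drv->probe(dev_dax);
}

/* Trivial sample driver for illustration. */
static int sample_probe(struct dev_dax *dev_dax)
{
	return dev_dax->id;
}
```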
Signed-off-by: Dan Williams
Signed-off-by: Andrew Morton
Cc: Jason Yan
Cc: Vishal Verma
Cc: Brice Goglin
Cc: Dave Hansen
Cc: Dave Jiang
Cc: David Hildenbrand
Cc: Ira Weiny
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Benjamin Herrenschmidt
Cc: Ben Skeggs
Cc: Bjorn Helgaas
Cc: Borislav Petkov
Cc: Boris Ostrovsky
Cc: Catalin Marinas
Cc: Daniel Vetter
Cc: David Airlie
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jeff Moyer
Cc: "Jérôme Glisse"
Cc: Juergen Gross
Cc: kernel test robot
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Paul Mackerras
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Stefano Stabellini
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Vivek Goyal
Cc: Wei Yang
Cc: Will Deacon
Link: https://lkml.kernel.org/r/160106113357.30709.4541750544799737855.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
-
In preparation for a facility that enables dax regions to be sub-divided,
introduce infrastructure to track and allocate region capacity.

The new dax_region/available_size attribute is only enabled for volatile
hmem devices, not pmem devices that are defined by nvdimm namespace
boundaries. This is per Jeff's feedback the last time dynamic device-dax
capacity allocation support was discussed.

Signed-off-by: Dan Williams
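A sketch of the accounting behind such an available_size attribute, under the assumption (not taken from the kernel source) that availability is simply the region span minus the sum of instance allocations:

```c
/* Hypothetical region bookkeeping: total span minus allocations. */
struct dax_region_sketch {
	unsigned long long size;	/* total region capacity */
	unsigned long long alloc[8];	/* per-instance allocations */
	int nr_alloc;
};

/* What a dax_region/available_size read might report. */
static unsigned long long dax_region_avail_size(const struct dax_region_sketch *r)
{
	unsigned long long used = 0;
	int i;

	for (i = 0; i < r->nr_alloc; i++)
		used += r->alloc[i];
	return r->size - used;
}
```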
Signed-off-by: Andrew Morton
Cc: Vishal Verma
Cc: Brice Goglin
Cc: Dave Hansen
Cc: Dave Jiang
Cc: David Hildenbrand
Cc: Ira Weiny
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Benjamin Herrenschmidt
Cc: Ben Skeggs
Cc: Bjorn Helgaas
Cc: Borislav Petkov
Cc: Boris Ostrovsky
Cc: Catalin Marinas
Cc: Daniel Vetter
Cc: David Airlie
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Hulk Robot
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jason Yan
Cc: Jeff Moyer
Cc: "Jérôme Glisse"
Cc: Juergen Gross
Cc: kernel test robot
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Paul Mackerras
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Stefano Stabellini
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Vivek Goyal
Cc: Wei Yang
Cc: Will Deacon
Link: https://lore.kernel.org/linux-nvdimm/x49shpp3zn8.fsf@segfault.boston.devel.redhat.com
Link: https://lkml.kernel.org/r/159643101035.4062302.6785857915652647857.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106112801.30709.14601438735305335071.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
-
The passed in dev_pagemap is only required in the pmem case as the
libnvdimm core may have reserved a vmem_altmap for dev_memremap_pages() to
place the memmap in pmem directly. In the hmem case there is no agent
reserving an altmap so it can all be handled by a core internal default.

Pass the resource range via a new @range property of 'struct
dev_dax_data'.

Signed-off-by: Dan Williams
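A sketch of what passing the span through 'struct dev_dax_data' might look like (the field names besides @range are assumptions for illustration):

```c
/* Span-only descriptor, as in the kernel's struct range. */
struct range {
	unsigned long long start;
	unsigned long long end;	/* inclusive */
};

/* Hypothetical instance-creation parameters; 'range' replaces the
 * resource that callers previously passed in. */
struct dev_dax_data {
	int id;
	struct range range;	/* span the new instance should cover */
};

static unsigned long long dev_dax_size(const struct dev_dax_data *data)
{
	return data->range.end - data->range.start + 1;
}
```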
Signed-off-by: Andrew Morton
Cc: David Hildenbrand
Cc: Vishal Verma
Cc: Dave Hansen
Cc: Pavel Tatashin
Cc: Brice Goglin
Cc: Dave Jiang
Cc: Ira Weiny
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Benjamin Herrenschmidt
Cc: Ben Skeggs
Cc: Bjorn Helgaas
Cc: Borislav Petkov
Cc: Boris Ostrovsky
Cc: Catalin Marinas
Cc: Daniel Vetter
Cc: David Airlie
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Hulk Robot
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jason Yan
Cc: Jeff Moyer
Cc: "Jérôme Glisse"
Cc: Juergen Gross
Cc: kernel test robot
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Stefano Stabellini
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Vivek Goyal
Cc: Wei Yang
Cc: Will Deacon
Link: https://lkml.kernel.org/r/159643099958.4062302.10379230791041872886.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106110513.30709.4303239334850606031.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
-
In preparation for adding more parameters to instance creation, move
existing parameters to a new struct.

Signed-off-by: Dan Williams
Signed-off-by: Andrew Morton
Cc: Vishal Verma
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Benjamin Herrenschmidt
Cc: Ben Skeggs
Cc: Borislav Petkov
Cc: Brice Goglin
Cc: Catalin Marinas
Cc: Daniel Vetter
Cc: Dave Hansen
Cc: Dave Jiang
Cc: David Airlie
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Ira Weiny
Cc: Jason Gunthorpe
Cc: Jeff Moyer
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Paul Mackerras
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Wei Yang
Cc: Will Deacon
Cc: Ard Biesheuvel
Cc: Bjorn Helgaas
Cc: Boris Ostrovsky
Cc: Hulk Robot
Cc: Jason Yan
Cc: "Jérôme Glisse"
Cc: Juergen Gross
Cc: kernel test robot
Cc: Randy Dunlap
Cc: Stefano Stabellini
Cc: Vivek Goyal
Link: https://lkml.kernel.org/r/159643099411.4062302.1337305960720423895.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
-
All callers specify the same flags to alloc_dax_region(), so there is no
need to allow for anything other than PFN_DEV|PFN_MAP, or carry a
->pfn_flags around on the region. Device-dax instances are always page
backed.

Signed-off-by: Dan Williams
Signed-off-by: Andrew Morton
Cc: Vishal Verma
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Benjamin Herrenschmidt
Cc: Ben Skeggs
Cc: Borislav Petkov
Cc: Brice Goglin
Cc: Catalin Marinas
Cc: Daniel Vetter
Cc: Dave Hansen
Cc: Dave Jiang
Cc: David Airlie
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Ira Weiny
Cc: Jason Gunthorpe
Cc: Jeff Moyer
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: Michael Ellerman
Cc: Mike Rapoport
Cc: Paul Mackerras
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Wei Yang
Cc: Will Deacon
Cc: Ard Biesheuvel
Cc: Bjorn Helgaas
Cc: Boris Ostrovsky
Cc: Hulk Robot
Cc: Jason Yan
Cc: "Jérôme Glisse"
Cc: Juergen Gross
Cc: kernel test robot
Cc: Randy Dunlap
Cc: Stefano Stabellini
Cc: Vivek Goyal
Link: https://lkml.kernel.org/r/159643098829.4062302.13611520567669439046.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds
07 Nov, 2019
1 commit
-
PFN flags are (unsigned long long); fix the alloc_dax_region() calling
convention to fix warnings of the form:

>> include/linux/pfn_t.h:18:17: warning: large integer implicitly truncated to unsigned type [-Woverflow]
   #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))

Reported-by: kbuild test robot
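The truncation is easy to reproduce in a sketch: a 64-bit flag passed through an 'unsigned long' parameter is silently cut down when long is 32 bits, which is why the calling convention needed to take unsigned long long.

```c
#define BITS_PER_LONG_LONG 64
/* High-bit flag in the style of include/linux/pfn_t.h (name is a
 * stand-in for this sketch). */
#define PFN_DEV_SKETCH (1ULL << (BITS_PER_LONG_LONG - 3))

/* A narrow calling convention drops the flag when long is 32-bit... */
static unsigned long long through_ulong(unsigned long flags)
{
	return flags;
}

/* ...while the widened convention preserves it everywhere. */
static unsigned long long through_ullong(unsigned long long flags)
{
	return flags;
}
```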
Signed-off-by: Dan Williams
Acked-by: Thomas Gleixner
Signed-off-by: Rafael J. Wysocki
07 Jan, 2019
6 commits
-
Persistent memory, as described by the ACPI NFIT (NVDIMM Firmware
Interface Table), is the first known instance of a memory range
described by a unique "target" proximity domain. "Initiator" and
"target" proximity domains are the approach that the ACPI HMAT
(Heterogeneous Memory Attributes Table) uses to describe the unique
performance properties of a memory range relative to a given initiator
(e.g. CPU or DMA device).

Currently the numa-node for a /dev/pmemX block-device or /dev/daxX.Y
char-device follows the traditional notion of 'numa-node' where the
attribute conveys the closest online numa-node. That numa-node attribute
is useful for cpu-binding and memory-binding processes *near* the
device. However, when the memory range backing a 'pmem', or 'dax' device
is onlined (memory hot-add) the memory-only-numa-node representing that
address needs to be differentiated from the set of online nodes. In
other words, the numa-node association of the device depends on whether
you can bind processes *near* the cpu-numa-node in the offline
device-case, or bind processes *on* the memory-range directly after the
backing address range is onlined.

Allow for the case that platform firmware describes persistent memory
with a unique proximity domain, i.e. when it is distinct from the
proximity of DRAM and CPUs that are on the same socket. Plumb the Linux
numa-node translation of that proximity through the libnvdimm region
device to namespaces that are in device-dax mode. With this in place the
proposed kmem driver [1] can optionally discover a unique numa-node
number for the address range as it transitions the memory from an
offline state managed by a device-driver to an online memory range
managed by the core-mm.

[1]: https://lore.kernel.org/lkml/20181022201317.8558C1D8@viggo.jf.intel.com
Reported-by: Fan Du
Cc: Michael Ellerman
Cc: "Oliver O'Halloran"
Cc: Dave Hansen
Cc: Jérôme Glisse
Reviewed-by: Yang Shi
Signed-off-by: Dan Williams
-
On the expectation that some environments may not upgrade libdaxctl
(userspace component that depends on the /sys/class/dax hierarchy),
provide a default / legacy dax_pmem_compat driver. The dax_pmem_compat
driver implements the original /sys/class/dax sysfs layout rather than
/sys/bus/dax. When userspace is upgraded it can blacklist this module
and switch to the dax_pmem driver going forward.

CONFIG_DEV_DAX_PMEM_COMPAT and supporting code will be deleted according
to the dax_pmem entry in Documentation/ABI/obsolete/.

Signed-off-by: Dan Williams
-
Introduce the 'new_id' concept for enabling a custom device-driver attach
policy for dax-bus drivers. The intended use is to have a mechanism for
hot-plugging device-dax ranges into the page allocator on-demand. With
this in place the default policy of using device-dax for performance
differentiated memory can be overridden by user-space policy that can
arrange for the memory range to be managed as 'System RAM' with
user-defined NUMA and other performance attributes.

Signed-off-by: Dan Williams
-
Move the responsibility of calling devm_request_resource() and
devm_memremap_pages() into the common device-dax driver. This is another
preparatory step to allowing an alternate personality driver for a
device-dax range.

Signed-off-by: Dan Williams
-
In support of multiple device-dax instances per device-dax-region and
allowing the 'kmem' driver to attach to dax-instances instead of the
current device-node access, convert the dax sub-system from a class to a
bus. Recall that the kmem driver takes reserved / special purpose
memories and assigns them to be managed by the core-mm.

Aside from the fact that the device-dax instances are registered and
probed on a bus, two other lifetime-management changes are made:

1/ Delay attaching a cdev until driver probe time
2/ A new run_dax() helper is introduced to allow restoring dax-operation
   after a kill_dax() event. So, at driver ->probe() time we run_dax()
   and at ->remove() time we kill_dax() and invalidate all mappings.

Signed-off-by: Dan Williams
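A toy model of that lifetime rule (the struct and its fields are stand-ins; the real run_dax()/kill_dax() gate dax operations and invalidate mappings on the actual device):

```c
/* Minimal model of the probe/remove gating described above. */
struct dax_device_sketch {
	int alive;	/* dax operations permitted? */
	int mappings;	/* outstanding mappings */
};

/* ->probe() time: restore dax-operation. */
static void run_dax(struct dax_device_sketch *d)
{
	d->alive = 1;
}

/* ->remove() time: stop dax-operation and invalidate all mappings. */
static void kill_dax(struct dax_device_sketch *d)
{
	d->alive = 0;
	d->mappings = 0;
}
```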
-
Towards eliminating the dax_class, move the dax-device-attribute
enabling to a new bus.c file in the core. The amount of code thrash in
subsequent patches is reduced as no logic changes are made, just pure
code movement.

A temporary export of unregister_dev_dax() and dax_attribute_groups is
needed to preserve compilation, but those symbols become static again in
a follow-on patch.

Signed-off-by: Dan Williams