29 Jul, 2020
1 commit
-
Move libnvdimm sysfs attributes that currently use an open coded
DEVICE_ATTR() to hide sensitive root-only information (physical memory
layout) to the new DEVICE_ATTR_ADMIN_RO() helper.Cc: Vishal Verma
Cc: Dave Jiang
Cc: Ira Weiny
Signed-off-by: Dan Williams
Signed-off-by: Vishal Verma
18 Mar, 2020
3 commits
-
The align attribute applies an alignment constraint for namespace
creation in a region. Whereas the 'align' attribute of a namespace
applied alignment padding via an info block, the 'align' attribute
applies alignment constraints to the free space allocation.The default for 'align' is the maximum known memremap_compat_align()
across all archs (16MiB from PowerPC at time of writing) multiplied by
the number of interleave ways if there is blk-aliasing. The minimum is
PAGE_SIZE and allows for the creation of cross-arch incompatible
namespaces, just as previous kernels allowed, but the expectation is
cross-arch and mode-independent compatibility by default.The regression risk with this change is limited to cases that were
dependent on the ability to create unaligned namespaces, *and* for some
reason are unable to opt-out of aligned namespaces by writing to
'regionX/align'. If such a scenario arises the default can be flipped
from opt-out to opt-in of compat-aligned namespace creation, but that is
a last resort. The kernel will otherwise continue to support existing
defined misaligned namespaces.Unfortunately this change needs to touch several parts of the
implementation at once:- region/available_size: expand busy extents to current align
- region/max_available_extent: expand busy extents to current align
- namespace/size: trim free space to current align...to keep the free space accounting conforming to the dynamic align
setting.Reported-by: Aneesh Kumar K.V
Reported-by: Jeff Moyer
Reviewed-by: Aneesh Kumar K.V
Reviewed-by: Jeff Moyer
Link: https://lore.kernel.org/r/158041478371.3889308.14542630147672668068.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams -
The NDD_ALIASING flag is used to indicate where pmem capacity might
alias with blk capacity and require labeling. It is also used to
indicate whether the DIMM supports labeling. Separate this latter
capability into its own flag so that the NDD_ALIASING flag is scoped to
true aliased configurations.To my knowledge aliased configurations only exist in the ACPI spec,
there are no known platforms that ship this support in production.This clarity allows namespace-capacity alignment constraints around
interleave-ways to be relaxed.Cc: Vishal Verma
Cc: Oliver O'Halloran
Reviewed-by: Jeff Moyer
Reviewed-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/158041477856.3889308.4212605617834097674.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams -
The pmem driver on PowerPC crashes with the following signature when
instantiating misaligned namespaces that map their capacity via
memremap_pages().BUG: Unable to handle kernel data access at 0xc001000406000000
Faulting instruction address: 0xc000000000090790
NIP [c000000000090790] arch_add_memory+0xc0/0x130
LR [c000000000090744] arch_add_memory+0x74/0x130
Call Trace:
arch_add_memory+0x74/0x130 (unreliable)
memremap_pages+0x74c/0xa30
devm_memremap_pages+0x3c/0xa0
pmem_attach_disk+0x188/0x770
nvdimm_bus_probe+0xd8/0x470With the assumption that only memremap_pages() has alignment
constraints, enforce memremap_compat_align() for
pmem_should_map_pages(), nd_pfn, and nd_dax cases. This includes
preventing the creation of namespaces where the base address is
misaligned and cases there infoblock padding parameters are invalid.Reported-by: Aneesh Kumar K.V
Cc: Jeff Moyer
Fixes: a3619190d62e ("libnvdimm/pfn: stop padding pmem namespaces to section alignment")
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Dan Williams
20 Nov, 2019
1 commit
-
Rather than update the permission in ->is_visible() set the permission
directly at declaration time.Cc: Ira Weiny
Cc: Vishal Verma
Signed-off-by: Dan Williams
Reviewed-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/157309905534.1582359.13927459228885931097.stgit@dwillia2-desk3.amr.corp.intel.com
18 Nov, 2019
1 commit
-
Statically initialize the attribute groups for each libnvdimm
device_type. This is a preparation step for removing unnecessary exports
of attributes that can be included in the device_type by default.Also take the opportunity to mark 'struct device_type' instances const.
Cc: Ira Weiny
Cc: Vishal Verma
Signed-off-by: Dan Williams
Reviewed-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/157309900111.1582359.2445687530383470348.stgit@dwillia2-desk3.amr.corp.intel.com
15 Nov, 2019
2 commits
-
The nvdimm core currently maps the full namespace to an ioremap range
while probing the namespace mode. This can result in probe failures on
architectures that have limited ioremap space.For example, with a large btt namespace that consumes most of I/O remap
range, depending on the sequence of namespace initialization, the user
can find a pfn namespace initialization failure due to unavailable I/O
remap space which nvdimm core uses for temporary mapping.nvdimm core can avoid this failure by only mapping the reserved info
block area to check for pfn superblock type and map the full namespace
resource only before using the namespace.Given that personalities like BTT can be layered on top of any namespace
type create a generic form of devm_nsio_enable (devm_namespace_enable)
and use it inside the per-personality attach routines. Now
devm_namespace_enable() is always paired with disable unless the mapping
is going to be used for long term runtime access.Signed-off-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/20191017073308.32645-1-aneesh.kumar@linux.ibm.com
[djbw: reworks to move devm_namespace_{en,dis}able into *attach helpers]
Reported-by: kbuild test robot
Link: https://lore.kernel.org/r/20191031105741.102793-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams -
Don't leave claim_class set to an invalid value if an error occurs in
btt_claim_class().While we are here change the return type of __holder_class_store() to be
clear about the values it is returning.This was found via code inspection.
Reported-by: Dan Carpenter
Reviewed-by: Vishal Verma
Signed-off-by: Ira Weiny
Link: https://lore.kernel.org/r/20190925211348.14082-1-ira.weiny@intel.com
Signed-off-by: Dan Williams
25 Sep, 2019
1 commit
-
nd_label->dpa issue was observed when trying to enable the namespace created
with little-endian kernel on a big-endian kernel. That made me run
`sparse` on the rest of the code and other changes are the result of that.Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing")
Fixes: 9dedc73a4658 ("libnvdimm/btt: Fix LBA masking during 'free list' population")
Reviewed-by: Vishal Verma
Signed-off-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/20190809074726.27815-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams
06 Sep, 2019
2 commits
-
Architectures have different page size than 4K. Use the PAGE_SIZE
to make sure ranges are correctly aligned.Signed-off-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/20190905154603.10349-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams -
The nd_region_probe_success() helper collides seed management with
nvdimm->busy tracking. Given the 'busy' increment is handled internal to the
nd_region driver 'probe' path move the decrement to the 'remove' path.
With that cleanup the routine can be renamed to the more descriptive
nd_region_advance_seeds().The change is prompted by an incoming need to optionally advance the
seeds on other events besides 'probe' success.Cc: "Aneesh Kumar K.V"
Signed-off-by: Dan Williams
Signed-off-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/20190905154603.10349-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams
27 Jul, 2019
1 commit
-
Pull libnvdimm fixes from Dan Williams:
"A collection of locking and async operations fixes for v5.3-rc2. These
had been soaking in a branch targeting the merge window, but missed
due to a regression hunt. This fixed up version has otherwise been in
-next this past week with no reported issues.In order to gain confidence in the locking changes the pull also
includes a debug / instrumentation patch to enable lockdep coverage
for libnvdimm subsystem operations that depend on the device_lock for
exclusion. As mentioned in the changelog it is a hack, but it works
and documents the locking expectations of the sub-system in a way that
others can use lockdep to verify. The driver core touches got an ack
from Greg.Summary:
- Fix duplicate device_unregister() calls (multiple threads competing
to do unregister work when scheduling device removal from a sysfs
attribute of the self-same device).- Fix badblocks registration order bug. Ensure region badblocks are
initialized in advance of namespace registration.- Fix a deadlock between the bus lock and probe operations.
- Export device-core infrastructure to coordinate async operations
via the device ->dead state.- Add device-core infrastructure to validate device_lock() usage with
lockdep"* tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
driver-core, libnvdimm: Let device subsystems add local lockdep coverage
libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
libnvdimm/region: Register badblocks before namespaces
libnvdimm/bus: Prevent duplicate device_unregister() calls
drivers/base: Introduce kill_device()
19 Jul, 2019
1 commit
-
For good reason, the standard device_lock() is marked
lockdep_set_novalidate_class() because there is simply no sane way to
describe the myriad ways the device_lock() ordered with other locks.
However, that leaves subsystems that know their own local device_lock()
ordering rules to find lock ordering mistakes manually. Instead,
introduce an optional / additional lockdep-enabled lock that a subsystem
can acquire in all the same paths that the device_lock() is acquired.A conversion of the NFIT driver and NVDIMM subsystem to a
lockdep-validate device_lock() scheme is included. The
debug_nvdimm_lock() implementation implements the correct lock-class and
stacking order for the libnvdimm device topology hierarchy.Yes, this is a hack, but hopefully it is a useful hack for other
subsystems device_lock() debug sessions. Quoting Greg:"Yeah, it feels a bit hacky but it's really up to a subsystem to mess up
using it as much as anything else, so user beware :)I don't object to it if it makes things easier for you to debug."
Cc: Ingo Molnar
Cc: Ira Weiny
Cc: Will Deacon
Cc: Dave Jiang
Cc: Keith Busch
Cc: Peter Zijlstra
Cc: Vishal Verma
Cc: "Rafael J. Wysocki"
Cc: Greg Kroah-Hartman
Signed-off-by: Dan Williams
Acked-by: Greg Kroah-Hartman
Reviewed-by: Ira Weiny
Link: https://lore.kernel.org/r/156341210661.292348.7014034644265455704.stgit@dwillia2-desk3.amr.corp.intel.com
06 Jul, 2019
1 commit
-
There is no need for caller to know how uuid_t type is constructed. Thus,
whenever we use it the implementation details are not needed. Drop it for good.Signed-off-by: Andy Shevchenko
Signed-off-by: Dan Williams
05 Jun, 2019
1 commit
-
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of version 2 of the gnu general public license as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more detailsextracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 64 file(s).
Signed-off-by: Thomas Gleixner
Reviewed-by: Alexios Zavras
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.de
Signed-off-by: Greg Kroah-Hartman
01 May, 2019
1 commit
-
Users have reported intermittent occurrences of DIMM initialization
failures due to duplicate allocations of address capacity detected in
the labels, or errors of the form below, both have the same root cause.nd namespace1.4: failed to track label: 0
WARNING: CPU: 17 PID: 1381 at drivers/nvdimm/label.c:863RIP: 0010:__pmem_label_update+0x56c/0x590 [libnvdimm]
Call Trace:
? nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
nd_pmem_namespace_label_update+0xd6/0x160 [libnvdimm]
uuid_store+0x17e/0x190 [libnvdimm]
kernfs_fop_write+0xf0/0x1a0
vfs_write+0xb7/0x1b0
ksys_write+0x57/0xd0
do_syscall_64+0x60/0x210Unfortunately those reports were typically with a busy parallel
namespace creation / destruction loop making it difficult to see the
components of the bug. However, Jane provided a simple reproducer using
the work-in-progress sub-section implementation.When ndctl is reconfiguring a namespace it may take an existing defunct
/ disabled namespace and reconfigure it with a new uuid and other
parameters. Critically namespace_update_uuid() takes existing address
resources and renames them for the new namespace to use / reconfigure as
it sees fit. The bug is that this rename only happens in the resource
tracking tree. Existing labels with the old uuid are not reaped leading
to a scenario where multiple active labels reference the same span of
address range.Teach namespace_update_uuid() to flag any references to the old uuid for
reaping at the next label update attempt.Cc:
Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
Link: https://github.com/pmem/ndctl/issues/91
Reported-by: Jane Chu
Reported-by: Jeff Moyer
Reported-by: Erwin Tsaur
Cc: Johannes Thumshirn
Signed-off-by: Dan Williams
23 Mar, 2019
1 commit
-
In case kmemdup fails, the fix goes to blk_err to avoid NULL
pointer dereference.Signed-off-by: Kangjie Lu
Signed-off-by: Dan Williams
12 Mar, 2019
1 commit
-
Merge the initial lead-in cleanups and fixes that resulted from the
effort to resolve bugs in the section-alignment padding implementation
in the nvdimm core. The back half of this topic is abandoned in favor of
implementing sub-section hotplug support.
05 Mar, 2019
1 commit
-
Use sysfs_streq() in place of open-coded strcmp()'s that check for an
optional "\n" at the end of the input.Reviewed-by: Vishal Verma
Signed-off-by: Dan Williams
13 Feb, 2019
1 commit
-
For recovery, where non-dax access is needed to a given physical address
range, and testing, allow the 'force_raw' attribute to override the
default establishment of a dev_pagemap.Otherwise without this capability it is possible to end up with a
namespace that can not be activated due to corrupted info-block, and one
that can not be repaired due to a section collision.Cc:
Fixes: 004f1afbe199 ("libnvdimm, pmem: direct map legacy pmem by default")
Signed-off-by: Dan Williams
03 Feb, 2019
1 commit
-
As Dexuan reports the NVDIMM_FAMILY_HYPERV platform is incompatible with
the existing Linux namespace implementation because it uses
NSLABEL_FLAG_LOCAL for x1-width PMEM interleave sets. Quirk it as an
platform / DIMM that does not provide BLK-aperture access. Allow the
libnvdimm core to assume no potential for aliasing. In case other
implementations make the same mistake, provide a "noblk" module
parameter to force-enable the quirk.Link: https://lkml.kernel.org/r/PU1P153MB0169977604493B82B662A01CBF920@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
Reported-by: Dexuan Cui
Tested-by: Dexuan Cui
Signed-off-by: Dan Williams
11 Dec, 2018
1 commit
-
kstrndup() takes care of '\0' terminator for the strings.
Use it here instead of kmemdup() + explicit terminating the input string.
Signed-off-by: Andy Shevchenko
Signed-off-by: Dave Jiang
Signed-off-by: Dan Williams
02 Oct, 2018
1 commit
-
The variable dev-parent is assigned twice with the same &nd_region->dev.
I think we could drop the second one.Signed-off-by: GuangZhe Fu
Signed-off-by: Dan Williams
26 Jul, 2018
1 commit
-
This patch will find the max contiguous area to determine the largest
pmem namespace size that can be created. If the requested size exceeds
the largest available, ENOSPC error will be returned.This fixes the allocation underrun error and wrong error return code
that have otherwise been observed as the following kernel warning:WARNING: CPU: PID: at drivers/nvdimm/namespace_devs.c:913 size_store
Fixes: a1f3e4d6a0c3 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support")
Cc:
Signed-off-by: Keith Busch
Reviewed-by: Vishal Verma
Signed-off-by: Dave Jiang
15 Jul, 2018
1 commit
-
When a DIMM is locked its namespace label area may not be. Introduce the
distinction of locked namespaces to allow namespace enumeration while
the capacity is locked.Signed-off-by: Dan Williams
07 Apr, 2018
1 commit
-
The following NULL dereference results from incorrectly assuming that
ndd is valid in this print:struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]);
/*
* Give up if we don't find an instance of a uuid at each
* position (from 0 to nd_region->ndr_mappings - 1), or if we
* find a dimm with two instances of the same uuid.
*/
dev_err(&nd_region->dev, "%s missing label for %pUb\n",
dev_name(ndd->dev), nd_label->uuid);BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1
[..]
RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
[..]
Call Trace:
? devres_add+0x2f/0x40
? devm_kmalloc+0x52/0x60
? nd_region_activate+0x9c/0x320 [libnvdimm]
nd_region_probe+0x94/0x260 [libnvdimm]
? kernfs_add_one+0xe4/0x130
nvdimm_bus_probe+0x63/0x100 [libnvdimm]Switch to using the nvdimm device directly.
Fixes: 0e3b0d123c8f ("libnvdimm, namespace: allow multiple pmem...")
Cc:
Reported-by: Dave Jiang
Signed-off-by: Dan Williams
07 Mar, 2018
1 commit
-
Dynamic debug can be instructed to add the function name to the debug
output using the +f switch, so there is no need for the libnvdimm
modules to do it again. If a user decides to add the +f switch for
libnvdimm's dynamic debug this results in double prints of the function
name.Reported-by: Johannes Thumshirn
Reported-by: Ross Zwisler
Signed-off-by: Dan Williams
03 Feb, 2018
1 commit
-
Pointer nd_mapping is being initialized to a value that is never read,
instead it is being updated to a new value in all the cases where it
is being read afterwards, hence the initialization is redundant and
can be removed.Cleans up clang warning:
drivers/nvdimm/namespace_devs.c:2411:21: warning: Value stored to
'nd_mapping' during its initialization is never reaSigned-off-by: Colin Ian King
Reviewed-by: Ross Zwisler
Signed-off-by: Ross Zwisler
08 Oct, 2017
1 commit
-
The functions create_namespace_pmem and create_namespace_blk are local
to the source and do not need to be in global scope, so make them static.Cleans up sparse warnings:
symbol 'create_namespace_pmem' was not declared. Should it be static?
symbol 'create_namespace_blk' was not declared. Should it be static?Signed-off-by: Colin Ian King
Reviewed-by: Johannes Thumshirn
Signed-off-by: Dan Williams
29 Sep, 2017
1 commit
-
For the same reason that /proc/iomem returns 0's for non-root readers
and acpi tables are root-only, make the 'resource' attribute for
namespace devices only readable by root. Otherwise we disclose physical
address information.Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
Cc:
Reported-by: Dave Hansen
Signed-off-by: Dan Williams
19 Sep, 2017
1 commit
-
Maurice reports:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: holder_class_store+0x253/0x2b0 [libnvdimm]...while trying to reconfigure an NVDIMM-N namespace into 'sector' /
'btt' mode. The crash points to this line:(gdb) li *(holder_class_store+0x253)
0x7773 is in holder_class_store (drivers/nvdimm/namespace_devs.c:1420).
1415 for (i = 0; i < nd_region->ndr_mappings; i++) {
1416 struct nd_mapping *nd_mapping = &nd_region->mapping[i];
1417 struct nvdimm_drvdata *ndd = to_ndd(nd_mapping);
1418 struct nd_namespace_index *nsindex;
1419
1420 nsindex = to_namespace_index(ndd, ndd->ns_current);...where we are failing because ndd is NULL due to NVDIMM-N dimms not
supporting labels.Long story short, default to the BTTv1 format in the label-less /
NVDIMM-N case.Fixes: 14e494542636 ("libnvdimm, btt: BTT updates for UEFI 2.7 format")
Cc:
Cc: Vishal Verma
Reported-by: Maurice A. Saldivar
Tested-by: Maurice A. Saldivar
Signed-off-by: Dan Williams
12 Aug, 2017
1 commit
-
Prepare for other another consumer of this size selection scheme that is
not a 'sector size'.Cc: Oliver O'Halloran
Signed-off-by: Dan Williams
04 Jul, 2017
1 commit
30 Jun, 2017
1 commit
-
The UEFI 2.7 specification defines an updated BTT metadata format,
bumping the revision to 2.0. Add support for the new format, while
retaining compatibility for the old 1.1 format.Cc: Toshi Kani
Cc: Linda Knippers
Cc: Dan Williams
Signed-off-by: Vishal Verma
Signed-off-by: Dan Williams
28 Jun, 2017
2 commits
-
Allow volatile nfit ranges to participate in all the same infrastructure
provided for persistent memory regions. A resulting resulting namespace
device will still be called "pmem", but the parent region type will be
"nd_volatile". This is in preparation for disabling the dax ->flush()
operation in the pmem driver when it is hosted on a volatile range.Cc: Jan Kara
Cc: Jeff Moyer
Cc: Christoph Hellwig
Cc: Matthew Wilcox
Cc: Ross Zwisler
Signed-off-by: Dan Williams -
Now that all callers of the pmem api have been converted to dax helpers that
call back to the pmem driver, we can remove include/linux/pmem.h and
asm/pmem.h.Cc:
Cc: Jeff Moyer
Cc: Ingo Molnar
Cc: Christoph Hellwig
Cc: Toshi Kani
Cc: Oliver O'Halloran
Cc: Ross Zwisler
Reviewed-by: Jan Kara
Signed-off-by: Dan Williams
16 Jun, 2017
4 commits
-
Starting with v1.2 labels, 'address abstractions' can be hinted via an
address abstraction id that implies an info-block format. The standard
address abstraction in the specification is the v2 format of the
Block-Translation-Table (BTT). Support for that is saved for a later
patch, for now we add support for the Linux supported address
abstractions BTT (v1), PFN, and DAX.The new 'holder_class' attribute for namespace devices is added for
tooling to specify the 'abstraction_guid' to store in the namespace label.
For v1.1 labels this field is undefined and any setting of
'holder_class' away from the default 'none' value will only have effect
until the driver is unloaded. Setting 'holder_class' requires that
whatever device tries to claim the namespace must be of the specified
class.Cc: Vishal Verma
Signed-off-by: Dan Williams -
Starting with the v1.2 definition of namespace labels, the isetcookie
field is populated and validated for blk-aperture namespaces. This adds
some safety against inadvertent copying of namespace labels from one
DIMM-device to another.Signed-off-by: Dan Williams
-
The type_guid refers to the "Address Range Type GUID" for the region
backing a namespace as defined the ACPI NFIT (NVDIMM Firmware Interface
Table). This 'type' identifier specifies an access mechanism for the
given namespace. This capability replaces the confusing usage of the
'NSLABEL_FLAG_LOCAL' flag to indicate a block-aperture-mode namespace.Signed-off-by: Dan Williams
-
Previously we only honored the lba size for blk-aperture mode
namespaces. For pmem namespaces the lba size was just assumed to be 512.
With the new v1.2 label definition and compatibility with other
operating environments, the ->lbasize property is now respected for pmem
namespaces.Cc: Ross Zwisler
Signed-off-by: Dan Williams