05 Jan, 2020

1 commit

  • [ Upstream commit 4e24e37d5313edca8b4ab86f240c046c731e28d6 ]

    drivers/nvdimm/btt.c: In function 'btt_read_pg':
    drivers/nvdimm/btt.c:1264:8: warning: variable 'rc' set but not used
    [-Wunused-but-set-variable]
    int rc;
    ^~

    Add a ratelimited message in case a storm of errors is encountered.
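
    A minimal sketch of the shape of such a fix (the call site, message, and
    use of btt.c's local to_dev() helper are assumptions; the actual
    error-clearing path in btt_read_pg() differs):

        if (rc)
                /* consume 'rc' and avoid flooding the log on an error storm */
                dev_err_ratelimited(to_dev(arena),
                                "error clearing poison: %d\n", rc);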

    Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing")
    Signed-off-by: Qian Cai
    Reviewed-by: Vishal Verma
    Link: https://lore.kernel.org/r/1572530719-32161-1-git-send-email-cai@lca.pw
    Signed-off-by: Dan Williams
    Signed-off-by: Sasha Levin

    Qian Cai
     

30 Sep, 2019

1 commit

  • More libnvdimm updates from Dan Williams:

    - Complete the reworks to interoperate with powerpc dynamic huge page
    sizes

    - Fix a crash due to missed accounting for the powerpc 'struct
    page'-memmap mapping granularity

    - Fix badblock initialization for volatile (DRAM emulated) pmem ranges

    - Stop triggering request_key() notifications to userspace when
    NVDIMM-security is disabled / not present

    - Miscellaneous small fixups

    * tag 'libnvdimm-fixes-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm/region: Enable MAP_SYNC for volatile regions
    libnvdimm: prevent nvdimm from requesting key when security is disabled
    libnvdimm/region: Initialize bad block for volatile namespaces
    libnvdimm/nfit_test: Fix acpi_handle redefinition
    libnvdimm/altmap: Track namespace boundaries in altmap
    libnvdimm: Fix endian conversion issues 
    libnvdimm/dax: Pick the right alignment default when creating dax devices
    powerpc/book3s64: Export has_transparent_hugepage() related functions.

    Linus Torvalds
     

25 Sep, 2019

6 commits

  • Some environments want to use a host tmpfs/ramdisk to back guest pmem.
    While the data is not persisted relative to the host it *is* persisted
    relative to guest crashes / reboots. The guest is free to use dax and
    MAP_SYNC to keep filesystem metadata consistent with dax accesses
    without requiring guest fsync(). The guest can also observe that the
    region is volatile and skip cache flushing as global visibility is
    enough to "persist" data relative to the host staying alive over guest
    reset events.
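
    For illustration, a guest application opts in with a MAP_SYNC mapping
    roughly as below (a userspace sketch; the file path and mapping length
    are assumptions):

        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <sys/mman.h>

        int fd = open("/mnt/pmem/data", O_RDWR);
        void *p = mmap(NULL, 1UL << 21, PROT_READ | PROT_WRITE,
                       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        /* stores through 'p' keep file metadata consistent without fsync() */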

    Signed-off-by: Aneesh Kumar K.V
    Reviewed-by: Pankaj Gupta
    Link: https://lore.kernel.org/r/20190924114327.14700-1-aneesh.kumar@linux.ibm.com
    [djbw: reword the changelog]
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • The current implementation attempts to request keys from the keyring even
    when security is not enabled. Change the behavior so that when security is
    disabled, the key request is skipped.

    Error messages seen when no keys are installed and libnvdimm is loaded:

    request-key[4598]: Cannot find command to construct key 661489677
    request-key[4606]: Cannot find command to construct key 34713726

    Cc: stable@vger.kernel.org
    Fixes: 4c6926a23b76 ("acpi/nfit, libnvdimm: Add unlock of nvdimm support for Intel DIMMs")
    Signed-off-by: Dave Jiang
    Link: https://lore.kernel.org/r/156934642272.30222.5230162488753445916.stgit@djiang5-desk3.ch.intel.com
    Signed-off-by: Dan Williams

    Dave Jiang
     
  • We check for bad blocks during namespace init, and that check uses the
    region bad block list. We need to initialize the bad block list for
    volatile regions for this to work. We also observe the lockdep warning
    below because the lock is not initialized correctly when bad block init
    is skipped for volatile regions.

    INFO: trying to register non-static key.
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc1-15699-g3dee241c937e #149
    Call Trace:
    [c0000000f95cb250] [c00000000147dd84] dump_stack+0xe8/0x164 (unreliable)
    [c0000000f95cb2a0] [c00000000022ccd8] register_lock_class+0x308/0xa60
    [c0000000f95cb3a0] [c000000000229cc0] __lock_acquire+0x170/0x1ff0
    [c0000000f95cb4c0] [c00000000022c740] lock_acquire+0x220/0x270
    [c0000000f95cb580] [c000000000a93230] badblocks_check+0xc0/0x290
    [c0000000f95cb5f0] [c000000000d97540] nd_pfn_validate+0x5c0/0x7f0
    [c0000000f95cb6d0] [c000000000d98300] nd_dax_probe+0xd0/0x1f0
    [c0000000f95cb760] [c000000000d9b66c] nd_pmem_probe+0x10c/0x160
    [c0000000f95cb790] [c000000000d7f5ec] nvdimm_bus_probe+0x10c/0x240
    [c0000000f95cb820] [c000000000d0f844] really_probe+0x254/0x4e0
    [c0000000f95cb8b0] [c000000000d0fdfc] driver_probe_device+0x16c/0x1e0
    [c0000000f95cb930] [c000000000d10238] device_driver_attach+0x68/0xa0
    [c0000000f95cb970] [c000000000d1040c] __driver_attach+0x19c/0x1c0
    [c0000000f95cb9f0] [c000000000d0c4c4] bus_for_each_dev+0x94/0x130
    [c0000000f95cba50] [c000000000d0f014] driver_attach+0x34/0x50
    [c0000000f95cba70] [c000000000d0e208] bus_add_driver+0x178/0x2f0
    [c0000000f95cbb00] [c000000000d117c8] driver_register+0x108/0x170
    [c0000000f95cbb70] [c000000000d7edb0] __nd_driver_register+0xe0/0x100
    [c0000000f95cbbd0] [c000000001a6baa4] nd_pmem_driver_init+0x34/0x48
    [c0000000f95cbbf0] [c0000000000106f4] do_one_initcall+0x1d4/0x4b0
    [c0000000f95cbcd0] [c0000000019f499c] kernel_init_freeable+0x544/0x65c
    [c0000000f95cbdb0] [c000000000010d6c] kernel_init+0x2c/0x180
    [c0000000f95cbe20] [c00000000000b954] ret_from_kernel_thread+0x5c/0x68
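
    A sketch of the missing initialization (the exact call site in the region
    driver differs; badblocks_init() is what sets up the seqlock that lockdep
    complains about above):

        struct badblocks *bb = &nd_region->bb;

        badblocks_init(bb, 1);          /* sets up bb->lock (seqlock) and the backing page */
        /* ... later, nd_pfn_validate() can safely call ... */
        badblocks_check(bb, sector, num_sectors, &first_bad, &num_bad);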

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190919083355.26340-1-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • With a PFN_MODE_PMEM namespace, the memmap area is allocated from the device
    area. Some architectures map the memmap area with a large page size. On
    architectures like ppc64, a 16MB page for the memmap mapping can map 262144
    pfns. This maps a namespace size of 16G.

    When populating the memmap region with 16MB pages from the device area,
    make sure the allocated space is not used to map resources outside this
    namespace. Such usage of the device area will prevent a namespace destroy.

    Add the resource end pfn in the altmap and use that to check whether the
    memmap area allocation can map pfns outside the namespace. On ppc64, in
    that case we fall back to allocation from memory.

    This fixes the kernel crash reported below:

    [ 132.034989] WARNING: CPU: 13 PID: 13719 at mm/memremap.c:133 devm_memremap_pages_release+0x2d8/0x2e0
    [ 133.464754] BUG: Unable to handle kernel data access at 0xc00c00010b204000
    [ 133.464760] Faulting instruction address: 0xc00000000007580c
    [ 133.464766] Oops: Kernel access of bad area, sig: 11 [#1]
    [ 133.464771] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
    .....
    [ 133.464901] NIP [c00000000007580c] vmemmap_free+0x2ac/0x3d0
    [ 133.464906] LR [c0000000000757f8] vmemmap_free+0x298/0x3d0
    [ 133.464910] Call Trace:
    [ 133.464914] [c000007cbfd0f7b0] [c0000000000757f8] vmemmap_free+0x298/0x3d0 (unreliable)
    [ 133.464921] [c000007cbfd0f8d0] [c000000000370a44] section_deactivate+0x1a4/0x240
    [ 133.464928] [c000007cbfd0f980] [c000000000386270] __remove_pages+0x3a0/0x590
    [ 133.464935] [c000007cbfd0fa50] [c000000000074158] arch_remove_memory+0x88/0x160
    [ 133.464942] [c000007cbfd0fae0] [c0000000003be8c0] devm_memremap_pages_release+0x150/0x2e0
    [ 133.464949] [c000007cbfd0fb70] [c000000000738ea0] devm_action_release+0x30/0x50
    [ 133.464955] [c000007cbfd0fb90] [c00000000073a5a4] release_nodes+0x344/0x400
    [ 133.464961] [c000007cbfd0fc40] [c00000000073378c] device_release_driver_internal+0x15c/0x250
    [ 133.464968] [c000007cbfd0fc80] [c00000000072fd14] unbind_store+0x104/0x110
    [ 133.464973] [c000007cbfd0fcd0] [c00000000072ee24] drv_attr_store+0x44/0x70
    [ 133.464981] [c000007cbfd0fcf0] [c0000000004a32bc] sysfs_kf_write+0x6c/0xa0
    [ 133.464987] [c000007cbfd0fd10] [c0000000004a1dfc] kernfs_fop_write+0x17c/0x250
    [ 133.464993] [c000007cbfd0fd60] [c0000000003c348c] __vfs_write+0x3c/0x70
    [ 133.464999] [c000007cbfd0fd80] [c0000000003c75d0] vfs_write+0xd0/0x250

    djbw: Aneesh notes that this crash can likely be triggered in any kernel that
    supports 'papr_scm', so flagging that commit for -stable consideration.

    Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions")
    Cc:
    Reported-by: Sachin Sant
    Signed-off-by: Aneesh Kumar K.V
    Reviewed-by: Pankaj Gupta
    Tested-by: Santosh Sivaraj
    Reviewed-by: Johannes Thumshirn
    Link: https://lore.kernel.org/r/20190910062826.10041-1-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • An nd_label->dpa endianness issue was observed when trying to enable a
    namespace created with a little-endian kernel on a big-endian kernel. That
    prompted running `sparse` on the rest of the code, and the other changes
    are the result of that.
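
    A hedged example of the class of issue sparse flags here (the field is
    from the on-media label; the exact comparisons that were fixed differ):

        __le64 dpa = nd_label->dpa;              /* stored little-endian on media */
        if (__le64_to_cpu(dpa) != res->start)    /* convert before comparing */
                continue;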

    Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing")
    Fixes: 9dedc73a4658 ("libnvdimm/btt: Fix LBA masking during 'free list' population")
    Reviewed-by: Vishal Verma
    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190809074726.27815-1-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • Allow the arch to provide the supported alignments and use hugepage
    alignment only if hugepages are supported. Right now we depend on
    compile-time configs, whereas this patch switches to runtime discovery.

    Architectures like ppc64 can have THP enabled in code, but then have the
    hugepage size disabled by the hypervisor. This allows us to create dax
    devices with PAGE_SIZE alignment in this case.

    An existing dax namespace with an alignment larger than PAGE_SIZE will fail
    to initialize in this specific case. We still allow fsdax namespace
    initialization.

    With respect to identifying whether to enable hugepage faults for a dax
    device: if THP is enabled at compile time, we default to taking hugepage
    faults, and in the dax fault handler, if we find the fault size is greater
    than the alignment, we retry with a PAGE_SIZE fault size.

    This also addresses the failure scenario below on ppc64:

    ndctl create-namespace --mode=devdax | grep align
    "align":16777216,
    "align":16777216

    cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
    65536 16777216

    daxio.static-debug -z -o /dev/dax0.0
    Bus error (core dumped)

    $ dmesg | tail
    lpar: Failed hash pte insert with error -4
    hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
    hash-mmu: trap=0x300 vsid=0x22cb7a3 ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
    daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
    daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
    daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 e93f0088 39290008 f93f0110

    The failure was due to the guest kernel using the wrong page size.

    The namespaces created with 16M alignment will appear as below on a config with
    16M page size disabled.

    $ ndctl list -Ni
    [
      {
        "dev":"namespace0.1",
        "mode":"fsdax",
        "map":"dev",
        "size":5351931904,
        "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
        "align":16777216,
        "blockdev":"pmem0.1",
        "supported_alignments":[
          65536
        ]
      },
      {
        "dev":"namespace0.0",
        "mode":"fsdax",
    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     

22 Sep, 2019

2 commits

  • Pull libnvdimm updates from Dan Williams:
    "Some reworks to better support nvdimms on powerpc and an nvdimm
    security interface update:

    - Rework the nvdimm core to accommodate architectures with different
    page sizes and ones that can change supported huge page sizes at
    boot time rather than a compile time constant.

    - Introduce a distinct 'frozen' attribute for the nvdimm security
    state since it is independent of the locked state.

    - Miscellaneous fixups"

    * tag 'libnvdimm-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm: Use PAGE_SIZE instead of SZ_4K for align check
    libnvdimm/label: Remove the dpa align check
    libnvdimm/pfn_dev: Add page size and struct page size to pfn superblock
    libnvdimm/pfn_dev: Add a build check to make sure we notice when struct page size change
    libnvdimm/pmem: Advance namespace seed for specific probe errors
    libnvdimm/region: Rewrite _probe_success() to _advance_seeds()
    libnvdimm/security: Consolidate 'security' operations
    libnvdimm/security: Tighten scope of nvdimm->busy vs security operations
    libnvdimm/security: Introduce a 'frozen' attribute
    libnvdimm, region: Use struct_size() in kzalloc()
    tools/testing/nvdimm: Fix fallthrough warning
    libnvdimm/of_pmem: Provide a unique name for bus provider

    Linus Torvalds
     
  • Pull hmm updates from Jason Gunthorpe:
    "This is more cleanup and consolidation of the hmm APIs and the very
    strongly related mmu_notifier interfaces. Many places across the tree
    using these interfaces are touched in the process. Beyond that a
    cleanup to the page walker API and a few memremap related changes
    round out the series:

    - General improvement of hmm_range_fault() and related APIs, more
    documentation, bug fixes from testing, API simplification &
    consolidation, and unused API removal

    - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE,
    and make them internal kconfig selects

    - Hoist a lot of code related to mmu notifier attachment out of
    drivers by using a refcount get/put attachment idiom and remove the
    convoluted mmu_notifier_unregister_no_release() and related APIs.

    - General API improvement for the migrate_vma API and revision of its
    only user in nouveau

    - Annotate mmu_notifiers with lockdep and sleeping region debugging

    Two series unrelated to HMM or mmu_notifiers came along due to
    dependencies:

    - Allow pagemap's memremap_pages family of APIs to work without
    providing a struct device

    - Make walk_page_range() and related use a constant structure for
    function pointers"

    * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits)
    libnvdimm: Enable unit test infrastructure compile checks
    mm, notifier: Catch sleeping/blocking for !blockable
    kernel.h: Add non_block_start/end()
    drm/radeon: guard against calling an unpaired radeon_mn_unregister()
    csky: add missing brackets in a macro for tlb.h
    pagewalk: use lockdep_assert_held for locking validation
    pagewalk: separate function pointers from iterator data
    mm: split out a new pagewalk.h header from mm.h
    mm/mmu_notifiers: annotate with might_sleep()
    mm/mmu_notifiers: prime lockdep
    mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end
    mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports
    mm/hmm: hmm_range_fault() infinite loop
    mm/hmm: hmm_range_fault() NULL pointer bug
    mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
    mm/mmu_notifiers: remove unregister_no_release
    RDMA/odp: remove ib_ucontext from ib_umem
    RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
    RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr
    RDMA/mlx5: Use ib_umem_start instead of umem.address
    ...

    Linus Torvalds
     

07 Sep, 2019

1 commit

  • The infrastructure to mock core libnvdimm routines for unit testing
    purposes is prone to bitrot relative to refactoring of that core. Arrange
    for the unit test core to be built when CONFIG_COMPILE_TEST=y. This does
    not result in a functional unit test environment; it is only a helper for
    0day to catch unit test build regressions.

    Note that there are a few x86isms in the implementation, so this does not
    bother compile-testing architectures other than 64-bit x86.

    Link: https://lore.kernel.org/r/156763690875.2556198.15786177395425033830.stgit@dwillia2-desk3.amr.corp.intel.com
    Reported-by: Christoph Hellwig
    Signed-off-by: Dan Williams
    Signed-off-by: Jason Gunthorpe

    Dan Williams
     

06 Sep, 2019

6 commits

  • Some architectures have a page size other than 4K. Use PAGE_SIZE to make
    sure ranges are correctly aligned.
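
    A sketch of the intent (the real checks live in the namespace/pfn device
    code):

        /* validate against the runtime page size, not a hard-coded SZ_4K */
        if (!IS_ALIGNED(start, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE))
                return -ENXIO;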

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-7-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • There's no strict requirement for slot_valid() to check for page alignment,
    and the check would seem to actively hurt cross-page-size compatibility.
    Let's delete the check and rely on checksum validation.

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-6-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • This is needed so that the pmem probe doesn't wrongly initialize a
    namespace which doesn't have enough space reserved for holding struct
    pages with the current kernel.

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-5-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • Namespaces created in PFN_MODE_PMEM mode store struct page in the reserve
    block area. We need to make sure we account for the right struct page
    size while doing this. Instead of directly depending on sizeof(struct page),
    which can change based on kernel config options, use the maximum struct
    page size (64) while calculating the reserve block area. This makes sure a
    pmem device can be used across kernels built with different configs.

    If the above assumption about the maximum struct page size changes, we need
    to update the reserve block allocation space for newly created namespaces.
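
    A sketch of the sizing rule (the surrounding nd_pfn_init() arithmetic has
    additional alignment terms; 'npfns' and 'align' are assumed here):

        #define MAX_STRUCT_PAGE_SIZE 64         /* worst case across configs */

        /* reserve the memmap using the worst-case per-page metadata size */
        offset = ALIGN(start + SZ_8K + MAX_STRUCT_PAGE_SIZE * npfns, align)
                 - start;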

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-4-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • In order to support marking namespaces with unsupported features/versions
    disabled, the nvdimm core should advance the namespace seed on these
    probe failures. Otherwise, a failed namespace will be considered a
    seed namespace and will be wrongly used while creating new namespaces.

    Add -EOPNOTSUPP as a return value from the pmem probe callback to indicate
    a namespace initialization failure due to a pfn superblock feature/version
    mismatch.
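
    A sketch of the probe-side signal (the actual validation checks more than
    the major version; 'supported' is a placeholder):

        if (__le16_to_cpu(pfn_sb->version_major) > supported)
                return -EOPNOTSUPP;     /* tell the region to advance its seed */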

    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-3-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     
  • The nd_region_probe_success() helper conflates seed management with
    nvdimm->busy tracking. Given that the 'busy' increment is handled internal
    to the nd_region driver 'probe' path, move the decrement to the 'remove'
    path. With that cleanup the routine can be renamed to the more descriptive
    nd_region_advance_seeds().

    The change is prompted by an incoming need to optionally advance the
    seeds on other events besides 'probe' success.

    Cc: "Aneesh Kumar K.V"
    Signed-off-by: Dan Williams
    Signed-off-by: Aneesh Kumar K.V
    Link: https://lore.kernel.org/r/20190905154603.10349-2-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Dan Williams
     

30 Aug, 2019

4 commits

  • The security operations are exported from libnvdimm/security.c to
    libnvdimm/dimm_devs.c, and libnvdimm/security.c is optionally compiled
    based on the CONFIG_NVDIMM_KEYS config symbol.

    Rather than export the operations across compile objects, just move the
    __security_store() entry point to live with the helpers.

    Acked-by: Jeff Moyer
    Reviewed-by: Dave Jiang
    Link: https://lore.kernel.org/r/156686730515.184120.10522747907309996674.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • An attempt to freeze DIMMs currently runs afoul of default blocking of
    all security operations in the entry to the 'store' routine for the
    'security' sysfs attribute.

    The blanket blocking of all security operations while the DIMM is in
    active use in a region is too restrictive. The only security operations
    that need to be aware of the ->busy state are those that mutate the
    state of data, i.e. erase and overwrite.

    Refactor the ->busy checks to be applied at the common entry point,
    __security_store(), rather than in each of the helper routines, to enable
    freeze to be run regardless of busy state.

    Reviewed-by: Dave Jiang
    Reviewed-by: Jeff Moyer
    Link: https://lore.kernel.org/r/156686729996.184120.3458026302402493937.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • In the process of debugging a system with an NVDIMM that was failing to
    unlock it was found that the kernel is reporting 'locked' while the DIMM
    security interface is 'frozen'. Unfortunately the security state is
    tracked internally as an enum which prevents it from communicating the
    difference between 'locked' and 'locked + frozen'. It follows that the
    enum also prevents the kernel from communicating 'unlocked + frozen'
    which would be useful for debugging why security operations like 'change
    passphrase' are disabled.

    Ditch the security state enum for a set of flags and introduce a new
    sysfs attribute explicitly for the 'frozen' state. The regression risk
    is low because the 'frozen' state was already blocked behind the
    'locked' state, but we will need to revisit this if there are cases where
    applications need 'frozen' to show up in the primary 'security'
    attribute. The expectation is that communicating 'frozen' is mostly a
    helper for debug and status monitoring.
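
    A sketch of the representation change (flag names follow this series):

        /* before: one mutually-exclusive state */
        enum nvdimm_security_state state;

        /* after: independent bits, so 'frozen' can coexist with the others */
        unsigned long flags;
        set_bit(NVDIMM_SECURITY_UNLOCKED, &flags);
        set_bit(NVDIMM_SECURITY_FROZEN, &flags);  /* surfaced via the new sysfs file */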

    Reviewed-by: Dave Jiang
    Reported-by: Jeff Moyer
    Reviewed-by: Jeff Moyer
    Link: https://lore.kernel.org/r/156686729474.184120.5835135644278860826.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • One of the more common cases of allocation size calculations is finding
    the size of a structure that has a zero-sized array at the end, along
    with memory for some number of elements for that array. For example:

    struct nd_region {
            ...
            struct nd_mapping mapping[0];
    };

    instance = kzalloc(sizeof(struct nd_region) +
                       sizeof(struct nd_mapping) * count, GFP_KERNEL);

    Instead of leaving these open-coded and prone to type mistakes, we can
    now use the new struct_size() helper:

    instance = kzalloc(struct_size(instance, mapping, count), GFP_KERNEL);

    This code was detected with the help of Coccinelle.

    Signed-off-by: Gustavo A. R. Silva
    Reviewed-by: Vishal Verma
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20190610210613.GA21989@embeddedor
    Signed-off-by: Dan Williams

    Gustavo A. R. Silva
     

29 Aug, 2019

1 commit

  • Yi reported[1] that after commit a3619190d62e ("libnvdimm/pfn: stop
    padding pmem namespaces to section alignment"), it was no longer
    possible to create a device dax namespace with a 1G alignment. The
    reason was that the pmem region was not itself 1G-aligned. The code
    happily skips past the first 512M, but fails to account for a now
    misaligned end offset (since space was allocated starting at that
    misaligned address, and extending for size GBs). Reintroduce
    end_trunc, so that the code correctly handles the misaligned end
    address. This results in the same behavior as before the introduction
    of the offending commit.

    [1] https://lists.01.org/pipermail/linux-nvdimm/2019-July/022813.html
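
    A sketch of the reintroduced trim (names follow the pfn info-block fields;
    the exact expressions in nd_pfn_init() differ):

        start_pad = ALIGN(start, align) - start;        /* skip misaligned head */
        end_trunc = end - ALIGN_DOWN(end, align);       /* drop misaligned tail */
        /* usable capacity now excludes both partial chunks at either end */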

    Fixes: a3619190d62e ("libnvdimm/pfn: stop padding pmem namespaces ...")
    Reported-and-tested-by: Yi Zhang
    Signed-off-by: Jeff Moyer
    Link: https://lore.kernel.org/r/x49ftll8f39.fsf@segfault.boston.devel.redhat.com
    Signed-off-by: Dan Williams

    Jeff Moyer
     

14 Aug, 2019

1 commit

  • ndctl binaries, v66 and older, mistakenly require each ndbus to have a
    unique name. If not, then while enumerating the buses in userspace ndctl
    drops buses with similar names, and as a result devices beneath those
    buses are not listed.

    Signed-off-by: Aneesh Kumar K.V
    Tested-by: Vaibhav Jain
    Link: https://lore.kernel.org/r/20190807040029.11344-1-aneesh.kumar@linux.ibm.com
    Signed-off-by: Dan Williams

    Aneesh Kumar K.V
     

27 Jul, 2019

1 commit

  • Pull libnvdimm fixes from Dan Williams:
    "A collection of locking and async operations fixes for v5.3-rc2. These
    had been soaking in a branch targeting the merge window, but missed
    due to a regression hunt. This fixed up version has otherwise been in
    -next this past week with no reported issues.

    In order to gain confidence in the locking changes the pull also
    includes a debug / instrumentation patch to enable lockdep coverage
    for libnvdimm subsystem operations that depend on the device_lock for
    exclusion. As mentioned in the changelog it is a hack, but it works
    and documents the locking expectations of the sub-system in a way that
    others can use lockdep to verify. The driver core touches got an ack
    from Greg.

    Summary:

    - Fix duplicate device_unregister() calls (multiple threads competing
    to do unregister work when scheduling device removal from a sysfs
    attribute of the self-same device).

    - Fix badblocks registration order bug. Ensure region badblocks are
    initialized in advance of namespace registration.

    - Fix a deadlock between the bus lock and probe operations.

    - Export device-core infrastructure to coordinate async operations
    via the device ->dead state.

    - Add device-core infrastructure to validate device_lock() usage with
    lockdep"

    * tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    driver-core, libnvdimm: Let device subsystems add local lockdep coverage
    libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
    libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
    libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
    libnvdimm/region: Register badblocks before namespaces
    libnvdimm/bus: Prevent duplicate device_unregister() calls
    drivers/base: Introduce kill_device()

    Linus Torvalds
     

20 Jul, 2019

1 commit

  • Merge yet more updates from Andrew Morton:
    "The rest of MM and a kernel-wide procfs cleanup.

    Summary of the more significant patches:

    - Patch series "mm/memory_hotplug: Factor out memory block
    devicehandling", v3. David Hildenbrand.

    Some spring-cleaning of the memory hotplug code, notably in
    drivers/base/memory.c

    - "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
    Shi.

    Fix /proc/pid/smaps output for THP pages used in shmem.

    - "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.

    Bugfix and speedup for kernel/resource.c

    - Patch series "mm: Further memory block device cleanups", David
    Hildenbrand.

    More spring-cleaning of the memory hotplug code.

    - Patch series "mm: Sub-section memory hotplug support". Dan
    Williams.

    Generalise the memory hotplug code so that pmem can use it more
    completely. Then remove the hacks from the libnvdimm code which
    were there to work around the memory-hotplug code's constraints.

    - "proc/sysctl: add shared variables for range check", Matteo Croce.

    We have about 250 instances of

    int zero;
    ...
    .extra1 = &zero,

    in the tree. This is a tree-wide sweep to make all those private
    "zero"s and "one"s use global variables.

    Alas, it isn't practical to make those two global integers const"

    * emailed patches from Andrew Morton : (38 commits)
    proc/sysctl: add shared variables for range check
    mm: migrate: remove unused mode argument
    mm/sparsemem: cleanup 'section number' data types
    libnvdimm/pfn: stop padding pmem namespaces to section alignment
    libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
    mm/devm_memremap_pages: enable sub-section remap
    mm: document ZONE_DEVICE memory-model implications
    mm/sparsemem: support sub-section hotplug
    mm/sparsemem: prepare for sub-section ranges
    mm: kill is_dev_zone() helper
    mm/hotplug: kill is_dev_zone() usage in __remove_pages()
    mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
    mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
    mm/sparsemem: add helpers track active portions of a section at boot
    mm/sparsemem: introduce a SECTION_IS_EARLY flag
    mm/sparsemem: introduce struct mem_section_usage
    drivers/base/memory.c: get rid of find_memory_block_hinted()
    mm/memory_hotplug: move and simplify walk_memory_blocks()
    mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
    mm: make register_mem_sect_under_node() static
    ...

    Linus Torvalds
     

19 Jul, 2019

9 commits

  • Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
    memory, we no longer need to add padding at pfn/dax device creation
    time. The kernel will still honor padding established by older kernels.

    Link: http://lkml.kernel.org/r/156092356588.979959.6793371748950931916.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams
    Reported-by: Jeff Moyer
    Tested-by: Aneesh Kumar K.V [ppc64]
    Cc: David Hildenbrand
    Cc: Jane Chu
    Cc: Jérôme Glisse
    Cc: Jonathan Corbet
    Cc: Logan Gunthorpe
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Oscar Salvador
    Cc: Pavel Tatashin
    Cc: Toshi Kani
    Cc: Vlastimil Babka
    Cc: Wei Yang
    Cc: Jason Gunthorpe
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     
  • At namespace creation time there is the potential for the "expected to
    be zero" fields of a 'pfn' info-block to be filled with indeterminate
    data. While the kernel buffer is zeroed on allocation, it is immediately
    overwritten by nd_pfn_validate(), filling it with the current contents of
    the on-media info-block location. For fields like 'flags' and the
    'padding', it potentially means that future implementations cannot rely
    on those fields being zero.

    In preparation to stop using the 'start_pad' and 'end_trunc' fields for
    section alignment, arrange for fields that are not explicitly
    initialized to be guaranteed zero. Bump the minor version to indicate
    it is safe to assume the 'padding' and 'flags' are zero. Otherwise,
    this corruption is expected to be benign since all other critical fields
    are explicitly initialized.
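
    A sketch of the guarantee being added (the buffer is reused by
    nd_pfn_validate(), so clear it before writing a fresh info-block;
    'new_minor' is a placeholder for the bumped version):

        memset(pfn_sb, 0, sizeof(*pfn_sb));
        /* ... explicitly initialize the supported fields ... */
        pfn_sb->version_minor = cpu_to_le16(new_minor);  /* padding now known-zero */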

    Note: the cc: stable is about spreading this new policy to as many
    kernels as possible, not fixing an issue in those kernels. It is not
    until the change titled "libnvdimm/pfn: Stop padding pmem namespaces to
    section alignment" where this improper initialization becomes a problem.
    So if someone decides to backport "libnvdimm/pfn: Stop padding pmem
    namespaces to section alignment" (which is not tagged for stable), make
    sure this pre-requisite is flagged.

    Link: http://lkml.kernel.org/r/156092356065.979959.6681003754765958296.stgit@dwillia2-desk3.amr.corp.intel.com
    Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
    Signed-off-by: Dan Williams
    Tested-by: Aneesh Kumar K.V [ppc64]
    Cc:
    Cc: David Hildenbrand
    Cc: Jane Chu
    Cc: Jeff Moyer
    Cc: Jérôme Glisse
    Cc: Jonathan Corbet
    Cc: Logan Gunthorpe
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Oscar Salvador
    Cc: Pavel Tatashin
    Cc: Toshi Kani
    Cc: Vlastimil Babka
    Cc: Wei Yang
    Cc: Jason Gunthorpe
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     
  • For good reason, the standard device_lock() is marked
    lockdep_set_novalidate_class() because there is simply no sane way to
    describe the myriad ways the device_lock() is ordered with other locks.
    However, that leaves subsystems that know their own local device_lock()
    ordering rules to find lock ordering mistakes manually. Instead,
    introduce an optional / additional lockdep-enabled lock that a subsystem
    can acquire in all the same paths that the device_lock() is acquired.

    A conversion of the NFIT driver and NVDIMM subsystem to a
    lockdep-validate device_lock() scheme is included. The
    debug_nvdimm_lock() implementation implements the correct lock-class and
    stacking order for the libnvdimm device topology hierarchy.

    Yes, this is a hack, but hopefully it is a useful hack for other
    subsystems device_lock() debug sessions. Quoting Greg:

    "Yeah, it feels a bit hacky but it's really up to a subsystem to mess up
    using it as much as anything else, so user beware :)

    I don't object to it if it makes things easier for you to debug."
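
    A sketch of the pattern (helper names follow this series; the real
    implementation also picks the right lock class for the device's place in
    the libnvdimm topology):

        static void nd_device_lock(struct device *dev)
        {
                device_lock(dev);
                debug_nvdimm_lock(dev);         /* lockdep-visible shadow lock */
        }

        static void nd_device_unlock(struct device *dev)
        {
                debug_nvdimm_unlock(dev);
                device_unlock(dev);
        }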

    Cc: Ingo Molnar
    Cc: Ira Weiny
    Cc: Will Deacon
    Cc: Dave Jiang
    Cc: Keith Busch
    Cc: Peter Zijlstra
    Cc: Vishal Verma
    Cc: "Rafael J. Wysocki"
    Cc: Greg Kroah-Hartman
    Signed-off-by: Dan Williams
    Acked-by: Greg Kroah-Hartman
    Reviewed-by: Ira Weiny
    Link: https://lore.kernel.org/r/156341210661.292348.7014034644265455704.stgit@dwillia2-desk3.amr.corp.intel.com

    Dan Williams
     
  • A multithreaded namespace creation/destruction stress test currently
    deadlocks with the following lockup signature:

    INFO: task ndctl:2924 blocked for more than 122 seconds.
    Tainted: G OE 5.2.0-rc4+ #3382
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    ndctl D 0 2924 1176 0x00000000
    Call Trace:
    ? __schedule+0x27e/0x780
    schedule+0x30/0xb0
    wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm]
    ? finish_wait+0x80/0x80
    uuid_store+0xe6/0x2e0 [libnvdimm]
    kernfs_fop_write+0xf0/0x1a0
    vfs_write+0xb7/0x1b0
    ksys_write+0x5c/0xd0
    do_syscall_64+0x60/0x240

    INFO: task ndctl:2923 blocked for more than 122 seconds.
    Tainted: G OE 5.2.0-rc4+ #3382
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    ndctl D 0 2923 1175 0x00000000
    Call Trace:
    ? __schedule+0x27e/0x780
    ? __mutex_lock+0x489/0x910
    schedule+0x30/0xb0
    schedule_preempt_disabled+0x11/0x20
    __mutex_lock+0x48e/0x910
    ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
    ? __lock_acquire+0x23f/0x1710
    ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
    nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
    __dax_pmem_probe+0x5e/0x210 [dax_pmem_core]
    ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm]
    dax_pmem_probe+0xc/0x20 [dax_pmem]
    nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
    really_probe+0xef/0x390
    driver_probe_device+0xb4/0x100

    In this sequence an 'nd_dax' device is being probed and trying to take
    the lock on its backing namespace to validate that the 'nd_dax' device
    indeed has exclusive access to the backing namespace. Meanwhile, another
    thread is trying to update the uuid property of that same backing
    namespace. So one thread is in the probe path trying to acquire the
    lock, and the other thread has acquired the lock and tries to flush the
    probe path.

    Fix this deadlock by not holding the namespace device_lock over the
    wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires
    the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and
    subsequently dropped internally to wait_nvdimm_bus_probe_idle().

    Cc:
    Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
    Cc: Vishal Verma
    Tested-by: Jane Chu
    Link: https://lore.kernel.org/r/156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • In preparation for fixing a deadlock between wait_for_bus_probe_idle()
    and the nvdimm_bus_list_mutex, arrange for __nd_ioctl() to run without
    nvdimm_bus_list_mutex held. This also unifies the 'dimm' and 'bus' level
    ioctls into a common nd_ioctl() preamble implementation.

    Marked for -stable as it is a pre-requisite for a follow-on fix.

    Cc:
    Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
    Cc: Vishal Verma
    Tested-by: Jane Chu
    Link: https://lore.kernel.org/r/156341209518.292348.7183897251740665198.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • In preparation for not holding a lock over the execution of nd_ioctl(),
    update the implementation to allow multiple threads to be attempting
    ioctls at the same time. The bus lock still prevents multiple in-flight
    ->ndctl() invocations from corrupting each other's state, but static
    global staging buffers are moved to the heap.
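
    A sketch of the change in shape (buffer name and size constant mirror the
    ioctl envelope; error handling abbreviated):

        /* before: a shared static staging buffer, unsafe for concurrent callers */
        static char in_env[ND_CMD_MAX_ENVELOPE];

        /* after: a per-call allocation */
        char *in_env = kzalloc(ND_CMD_MAX_ENVELOPE, GFP_KERNEL);
        if (!in_env)
                return -ENOMEM;
        /* ... use the buffer ... */
        kfree(in_env);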

    Reported-by: Vishal Verma
    Reviewed-by: Vishal Verma
    Tested-by: Vishal Verma
    Link: https://lore.kernel.org/r/156341208947.292348.10560140326807607481.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Namespace activation expects to be able to reference region badblocks.
    The following warning sometimes triggers when asynchronous namespace
    activation races in front of the completion of namespace probing. Move
    all possible namespace probing after region badblocks initialization.

    Otherwise, lockdep sometimes catches the uninitialized state of the
    badblocks seqlock with stack trace signatures like:

    INFO: trying to register non-static key.
    pmem2: detected capacity change from 0 to 136365211648
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    CPU: 9 PID: 358 Comm: kworker/u80:5 Tainted: G OE 5.2.0-rc4+ #3382
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
    Workqueue: events_unbound async_run_entry_fn
    Call Trace:
    dump_stack+0x85/0xc0
    pmem1.12: detected capacity change from 0 to 8589934592
    register_lock_class+0x56a/0x570
    ? check_object+0x140/0x270
    __lock_acquire+0x80/0x1710
    ? __mutex_lock+0x39d/0x910
    lock_acquire+0x9e/0x180
    ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
    badblocks_check+0x93/0x1f0
    ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
    nd_pfn_validate+0x28f/0x440 [libnvdimm]
    ? lockdep_hardirqs_on+0xf0/0x180
    nd_dax_probe+0x9a/0x120 [libnvdimm]
    nd_pmem_probe+0x6d/0x180 [nd_pmem]
    nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]

    Fixes: 48af2f7e52f4 ("libnvdimm, pfn: during init, clear errors...")
    Cc:
    Cc: Vishal Verma
    Reviewed-by: Vishal Verma
    Link: https://lore.kernel.org/r/156341208365.292348.1547528796026249120.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • A multithreaded namespace creation/destruction stress test currently
    fails with signatures like the following:

    sysfs group 'power' not found for kobject 'dax1.1'
    RIP: 0010:sysfs_remove_group+0x76/0x80
    Call Trace:
    device_del+0x73/0x370
    device_unregister+0x16/0x50
    nd_async_device_unregister+0x1e/0x30 [libnvdimm]
    async_run_entry_fn+0x39/0x160
    process_one_work+0x23c/0x5e0
    worker_thread+0x3c/0x390

    BUG: kernel NULL pointer dereference, address: 0000000000000020
    RIP: 0010:klist_put+0x1b/0x6c
    Call Trace:
    klist_del+0xe/0x10
    device_del+0x8a/0x2c9
    ? __switch_to_asm+0x34/0x70
    ? __switch_to_asm+0x40/0x70
    device_unregister+0x44/0x4f
    nd_async_device_unregister+0x22/0x2d [libnvdimm]
    async_run_entry_fn+0x47/0x15a
    process_one_work+0x1a2/0x2eb
    worker_thread+0x1b8/0x26e

    Use the kill_device() helper to atomically resolve the race of multiple
    threads issuing kill (device_unregister()) requests.

    Reported-by: Jane Chu
    Reported-by: Erwin Tsaur
    Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...")
    Cc:
    Link: https://github.com/pmem/ndctl/issues/96
    Tested-by: Jane Chu
    Link: https://lore.kernel.org/r/156341207846.292348.10435719262819764054.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Pull libnvdimm updates from Dan Williams:
    "Primarily just the virtio_pmem driver:

    - virtio_pmem

    The new virtio_pmem facility introduces a paravirtualized
    persistent memory device that allows a guest VM to use DAX
    mechanisms to access a host-file with host-page-cache. It arranges
    for MAP_SYNC to be disabled and instead triggers a host fsync()
    when a 'write-cache flush' command is sent to the virtual disk
    device.

    - Miscellaneous small fixups"

    * tag 'libnvdimm-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    virtio_pmem: fix sparse warning
    xfs: disable map_sync for async flush
    ext4: disable map_sync for async flush
    dax: check synchronous mapping is supported
    dm: enable synchronous dax
    libnvdimm: add dax_dev sync flag
    virtio-pmem: Add virtio pmem driver
    libnvdimm: nd_region flush callback support
    libnvdimm, namespace: Drop uuid_t implementation detail

    Linus Torvalds
     

17 Jul, 2019

1 commit

  • This patch fixes the sparse warnings below, related to the __virtio
    type, in the virtio pmem driver. The warnings were reported by the Intel
    test robot on the linux-next tree.

    nd_virtio.c:56:28: warning: incorrect type in assignment
    (different base types)
    nd_virtio.c:56:28: expected unsigned int [unsigned] [usertype] type
    nd_virtio.c:56:28: got restricted __virtio32
    nd_virtio.c:93:59: warning: incorrect type in argument 2
    (different base types)
    nd_virtio.c:93:59: expected restricted __virtio32 [usertype] val
    nd_virtio.c:93:59: got unsigned int [unsigned] [usertype] ret
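
    A hedged example of the conversion pattern such a fix applies (field and
    constant names are assumed from the virtio_pmem request/response structs):

        req->type = cpu_to_virtio32(vdev, VIRTIO_PMEM_REQ_TYPE_FLUSH);
        /* ... */
        err = virtio32_to_cpu(vdev, resp->ret);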

    Reported-by: kbuild test robot
    Signed-off-by: Pankaj Gupta
    Acked-by: Michael S. Tsirkin
    Signed-off-by: Dan Williams

    Pankaj Gupta
     

06 Jul, 2019

3 commits

  • This patch adds a 'DAXDEV_SYNC' flag which is set for an nd_region doing
    synchronous flush. This is later used to disable the MAP_SYNC
    functionality in the ext4 and xfs filesystems for devices that don't
    support synchronous flush.
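
    A sketch of how a filesystem consumes the flag in its mmap path (the
    dax_synchronous() helper is part of this series):

        if ((vma->vm_flags & VM_SYNC) && !dax_synchronous(dax_dev))
                return -EOPNOTSUPP;     /* refuse MAP_SYNC on a non-sync device */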

    Signed-off-by: Pankaj Gupta
    Signed-off-by: Dan Williams

    Pankaj Gupta
     
  • This patch adds a virtio-pmem driver for KVM guests.

    The guest reads the persistent memory range information from
    Qemu over VIRTIO and registers it on the nvdimm_bus. It also
    creates an nd_region object with the persistent memory range
    information so that the existing 'nvdimm/pmem' driver can
    reserve this in the system memory map. This way the
    'virtio-pmem' driver uses the existing functionality of the
    pmem driver to register persistent memory compatible with
    DAX-capable filesystems.

    This also provides a function to perform a guest flush over
    VIRTIO from the 'pmem' driver when userspace performs a flush
    on a DAX memory range.

    Signed-off-by: Pankaj Gupta
    Reviewed-by: Yuval Shaia
    Acked-by: Michael S. Tsirkin
    Acked-by: Jakub Staron
    Tested-by: Jakub Staron
    Reviewed-by: Cornelia Huck
    Signed-off-by: Dan Williams

    Pankaj Gupta
     
  • This patch adds functionality to perform a flush from the guest
    to the host over VIRTIO. We register a callback based on the
    'nd_region' type: the virtio_pmem driver requires this special
    flush function, while for the rest of the region types we
    register the existing flush function. An error returned by a
    host fsync failure is reported to userspace.
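
    A sketch of how a region driver opts into the callback (virtio_pmem-style
    usage; structure setup abbreviated):

        ndr_desc.res = &res;
        ndr_desc.flush = async_pmem_flush;      /* per-region flush override */
        ndr_desc.provider_data = vpmem;
        nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);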

    Signed-off-by: Pankaj Gupta
    Signed-off-by: Dan Williams

    Pankaj Gupta