Eric Lee / smarc-fsl-linux-kernel

30 Dec, 2020

2 commits

16ecf8cec libnvdimm/namespace: Fix reaping of invalidated block-window-namespace labels ... Browse Code »

commit 2dd2a1740ee19cd2636d247276cf27bfa434b0e2 upstream.

A recent change to ndctl to attempt to reconfigure namespaces in place
uncovered a label accounting problem in block-window-type namespaces.
The ndctl "create.sh" test is able to trigger this signature:

WARNING: CPU: 34 PID: 9167 at drivers/nvdimm/label.c:1100 __blk_label_update+0x9a3/0xbc0 [libnvdimm]
[..]
RIP: 0010:__blk_label_update+0x9a3/0xbc0 [libnvdimm]
[..]
Call Trace:
uuid_store+0x21b/0x2f0 [libnvdimm]
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xcc/0x380
ksys_write+0x68/0xe0

When allocated capacity for a namespace is renamed (new UUID) the labels
with the old UUID need to be deleted. The ndctl behavior to always
destroy namespaces on reconfiguration hid this problem.

The immediate impact of this bug is limited since block-window-type
namespaces only seem to exist in the specification and not in any
shipping products. However, the label handling code is being reused for
other technologies like CXL region labels, so there is a benefit to
making sure both vertical labels sets (block-window) and horizontal
label sets (pmem) have a functional reference implementation in
libnvdimm.

Fixes: c4703ce11c23 ("libnvdimm/namespace: Fix label tracking error")
Cc:
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Ira Weiny
Signed-off-by: Dan Williams
Signed-off-by: Greg Kroah-Hartman

Dan Williams
2020-12-30 18:54:27 +0800
0572a4aa7 libnvdimm/label: Return -ENXIO for no slot in __blk_label_update ... Browse Code »

[ Upstream commit 4c46764733c85b82c07e9559b39da4d00a7dd659 ]

Forget to set error code when nd_label_alloc_slot failed, and we
add it to avoid overwritten error code.

Fixes: 0ba1c634892b ("libnvdimm: write blk label set")
Signed-off-by: Zhang Qilong
Link: https://lore.kernel.org/r/20201205115056.2076523-1-zhangqilong3@huawei.com
Signed-off-by: Dan Williams
Signed-off-by: Sasha Levin

Zhang Qilong
2020-12-30 18:53:58 +0800

14 Oct, 2020

3 commits

b7b3c01b1 mm/memremap_pages: support multiple ranges per invocation ... Browse Code »

In support of device-dax growing the ability to front physically
dis-contiguous ranges of memory, update devm_memremap_pages() to track
multiple ranges with a single reference counter and devm instance.

Convert all [devm_]memremap_pages() users to specify the number of ranges
they are mapping in their 'struct dev_pagemap' instance.

Signed-off-by: Dan Williams
Signed-off-by: Andrew Morton
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Vishal Verma
Cc: Vivek Goyal
Cc: Dave Jiang
Cc: Ben Skeggs
Cc: David Airlie
Cc: Daniel Vetter
Cc: Ira Weiny
Cc: Bjorn Helgaas
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: "Jérôme Glisse"
Cc: Ard Biesheuvel
Cc: Ard Biesheuvel
Cc: Borislav Petkov
Cc: Brice Goglin
Cc: Catalin Marinas
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Hulk Robot
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jason Yan
Cc: Jeff Moyer
Cc: "Jérôme Glisse"
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: kernel test robot
Cc: Mike Rapoport
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Wei Yang
Cc: Will Deacon
Link: https://lkml.kernel.org/r/159643103789.4062302.18426128170217903785.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106116293.30709.13350662794915396198.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds

Dan Williams
2020-10-14 09:38:28 +0800
a4574f63e mm/memremap_pages: convert to 'struct range' ... Browse Code »

The 'struct resource' in 'struct dev_pagemap' is only used for holding
resource span information. The other fields, 'name', 'flags', 'desc',
'parent', 'sibling', and 'child' are all unused wasted space.

This is in preparation for introducing a multi-range extension of
devm_memremap_pages().

The bulk of this change is unwinding all the places internal to libnvdimm
that used 'struct resource' unnecessarily, and replacing instances of
'struct dev_pagemap'.res with 'struct dev_pagemap'.range.

P2PDMA had a minor usage of the resource flags field, but only to report
failures with "%pR". That is replaced with an open coded print of the
range.

[dan.carpenter@oracle.com: mm/hmm/test: use after free in dmirror_allocate_chunk()]
Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam

Signed-off-by: Dan Williams
Signed-off-by: Dan Carpenter
Signed-off-by: Andrew Morton
Reviewed-by: Boris Ostrovsky [xen]
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Vishal Verma
Cc: Vivek Goyal
Cc: Dave Jiang
Cc: Ben Skeggs
Cc: David Airlie
Cc: Daniel Vetter
Cc: Ira Weiny
Cc: Bjorn Helgaas
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: "Jérôme Glisse"
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Ard Biesheuvel
Cc: Borislav Petkov
Cc: Brice Goglin
Cc: Catalin Marinas
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Greg Kroah-Hartman
Cc: "H. Peter Anvin"
Cc: Hulk Robot
Cc: Ingo Molnar
Cc: Jason Gunthorpe
Cc: Jason Yan
Cc: Jeff Moyer
Cc: Jia He
Cc: Joao Martins
Cc: Jonathan Cameron
Cc: kernel test robot
Cc: Mike Rapoport
Cc: Pavel Tatashin
Cc: Peter Zijlstra
Cc: "Rafael J. Wysocki"
Cc: Randy Dunlap
Cc: Thomas Gleixner
Cc: Tom Lendacky
Cc: Wei Yang
Cc: Will Deacon
Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com
Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds

Dan Williams
2020-10-14 09:38:28 +0800
3ad11d7ac Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block ... Browse Code »

Pull block updates from Jens Axboe:

- Series of merge handling cleanups (Baolin, Christoph)

- Series of blk-throttle fixes and cleanups (Baolin)

- Series cleaning up BDI, seperating the block device from the
backing_dev_info (Christoph)

- Removal of bdget() as a generic API (Christoph)

- Removal of blkdev_get() as a generic API (Christoph)

- Cleanup of is-partition checks (Christoph)

- Series reworking disk revalidation (Christoph)

- Series cleaning up bio flags (Christoph)

- bio crypt fixes (Eric)

- IO stats inflight tweak (Gabriel)

- blk-mq tags fixes (Hannes)

- Buffer invalidation fixes (Jan)

- Allow soft limits for zone append (Johannes)

- Shared tag set improvements (John, Kashyap)

- Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

- DM no-wait support (Mike, Konstantin)

- Request allocation improvements (Ming)

- Allow md/dm/bcache to use IO stat helpers (Song)

- Series improving blk-iocost (Tejun)

- Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
Xianting, Yang, Yufen, yangerkun)

* tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
block: fix uapi blkzoned.h comments
blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
blk-mq: get rid of the dead flush handle code path
block: get rid of unnecessary local variable
block: fix comment and add lockdep assert
blk-mq: use helper function to test hw stopped
block: use helper function to test queue register
block: remove redundant mq check
block: invoke blk_mq_exit_sched no matter whether have .exit_sched
percpu_ref: don't refer to ref->data if it isn't allocated
block: ratelimit handle_bad_sector() message
blk-throttle: Re-use the throtl_set_slice_end()
blk-throttle: Open code __throtl_de/enqueue_tg()
blk-throttle: Move service tree validation out of the throtl_rb_first()
blk-throttle: Move the list operation after list validation
blk-throttle: Fix IO hang for a corner case
blk-throttle: Avoid tracking latency if low limit is invalid
blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
block: Remove redundant 'return' statement
...

Linus Torvalds
2020-10-14 03:12:44 +0800

13 Oct, 2020

1 commit

ca1b66922 Merge tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull RAS updates from Borislav Petkov:

- Extend the recovery from MCE in kernel space also to processes which
encounter an MCE in kernel space but while copying from user memory
by sending them a SIGBUS on return to user space and umapping the
faulty memory, by Tony Luck and Youquan Song.

- memcpy_mcsafe() rework by splitting the functionality into
copy_mc_to_user() and copy_mc_to_kernel(). This, as a result, enables
support for new hardware which can recover from a machine check
encountered during a fast string copy and makes that the default and
lets the older hardware which does not support that advance recovery,
opt in to use the old, fragile, slow variant, by Dan Williams.

- New AMD hw enablement, by Yazen Ghannam and Akshay Gupta.

- Do not use MSR-tracing accessors in #MC context and flag any fault
while accessing MCA architectural MSRs as an architectural violation
with the hope that such hw/fw misdesigns are caught early during the
hw eval phase and they don't make it into production.

- Misc fixes, improvements and cleanups, as always.

* tag 'ras_updates_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Allow for copy_mc_fragile symbol checksum to be generated
x86/mce: Decode a kernel instruction to determine if it is copying from user
x86/mce: Recover from poison found while copying from user space
x86/mce: Avoid tail copy when machine check terminated a copy from user
x86/mce: Add _ASM_EXTABLE_CPY for copy user access
x86/mce: Provide method to find out the type of an exception handler
x86/mce: Pass pointer to saved pt_regs to severity calculation routines
x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
x86/mce: Drop AMD-specific "DEFERRED" case from Intel severity rule list
x86/mce: Add Skylake quirk for patrol scrub reported errors
RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE()
x86/mce: Annotate mce_rd/wrmsrl() with noinstr
x86/mce/dev-mcelog: Do not update kflags on AMD systems
x86/mce: Stop mce_reign() from re-computing severity for every CPU
x86/mce: Make mce_rdmsrl() panic on an inaccessible MSR
x86/mce: Increase maximum number of banks to 64
x86/mce: Delay clearing IA32_MCG_STATUS to the end of do_machine_check()
x86/MCE/AMD, EDAC/mce_amd: Remove struct smca_hwid.xec_bitmap
RAS/CEC: Fix cec_init() prototype

Linus Torvalds
2020-10-13 01:14:38 +0800

06 Oct, 2020

1 commit

ec6347bb4 x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() ... Browse Code »

In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.

Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:

On Fri, May 1, 2020 at 11:28 AM Linus Torvalds wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.

Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().

Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.

One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.

[ bp: Massage a bit. ]

Signed-off-by: Dan Williams
Signed-off-by: Borislav Petkov
Reviewed-by: Tony Luck
Acked-by: Michael Ellerman
Cc:
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com

Dan Williams
2020-10-06 17:18:04 +0800

25 Sep, 2020

1 commit

a8b456d01 bdi: remove BDI_CAP_SYNCHRONOUS_IO ... Browse Code »

BDI_CAP_SYNCHRONOUS_IO is only checked in the swap code, and used to
decided if ->rw_page can be used on a block device. Just check up for
the method instead. The only complication is that zram needs a second
set of block_device_operations as it can switch between modes that
actually support ->rw_page and those who don't.

Signed-off-by: Christoph Hellwig
Reviewed-by: Jan Kara
Reviewed-by: Johannes Thumshirn
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-09-25 03:43:39 +0800

02 Sep, 2020

1 commit

32f61d675 nvdimm: simplify revalidate_disk handling ... Browse Code »

The nvdimm block driver abuse revalidate_disk in a strange way, and
totally unrelated to what other drivers do. Simplify this by just
calling nvdimm_revalidate_disk (which seems rather misnamed) from the
probe routines, as the additional bdev size revalidation is pointless
at this point, and remove the revalidate_disk methods given that
it can only be triggered from add_disk, which is right before the
manual calls.

Signed-off-by: Christoph Hellwig
Reviewed-by: Josef Bacik
Reviewed-by: Johannes Thumshirn
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-09-02 22:00:07 +0800

18 Aug, 2020

1 commit

62c789270 libnvdimm: KASAN: global-out-of-bounds Read in internal_create_group ... Browse Code »

Because the last member of the "nvdimm_firmware_attributes" array
was not assigned a null ptr, when traversal of "grp->attrs" array
is out of bounds in "create_files" func.

func:
create_files:
->for (i = 0, attr = grp->attrs; *attr && !error; i++, attr++)
->....

BUG: KASAN: global-out-of-bounds in create_files fs/sysfs/group.c:43 [inline]
BUG: KASAN: global-out-of-bounds in internal_create_group+0x9d8/0xb20
fs/sysfs/group.c:149
Read of size 8 at addr ffffffff8a2e4cf0 by task kworker/u17:10/959

CPU: 2 PID: 959 Comm: kworker/u17:10 Not tainted 5.8.0-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Workqueue: events_unbound async_run_entry_fn
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x18f/0x20d lib/dump_stack.c:118
print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
create_files fs/sysfs/group.c:43 [inline]
internal_create_group+0x9d8/0xb20 fs/sysfs/group.c:149
internal_create_groups.part.0+0x90/0x140 fs/sysfs/group.c:189
internal_create_groups fs/sysfs/group.c:185 [inline]
sysfs_create_groups+0x25/0x50 fs/sysfs/group.c:215
device_add_groups drivers/base/core.c:2024 [inline]
device_add_attrs drivers/base/core.c:2178 [inline]
device_add+0x7fd/0x1c40 drivers/base/core.c:2881
nd_async_device_register+0x12/0x80 drivers/nvdimm/bus.c:506
async_run_entry_fn+0x121/0x530 kernel/async.c:123
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

The buggy address belongs to the variable:
nvdimm_firmware_attributes+0x10/0x40

Link: https://lore.kernel.org/r/20200812085501.30963-1-qiang.zhang@windriver.com
Link: https://lore.kernel.org/r/20200814150509.225615-1-vaibhav@linux.ibm.com
Fixes: 48001ea50d17f ("PM, libnvdimm: Add runtime firmware activation support")
Reported-by: syzbot+1cf0ffe61aecf46f588f@syzkaller.appspotmail.com
Reported-by: Sandipan Das
Reported-by: Vaibhav Jain
Reviewed-by: Ira Weiny
Signed-off-by: Zqiang
Signed-off-by: Vishal Verma

Zqiang
2020-08-18 04:47:38 +0800

15 Aug, 2020

1 commit

af3bbc12d mm: add thp_size ... Browse Code »

This function returns the number of bytes in a THP. It is like
page_size(), but compiles to just PAGE_SIZE if CONFIG_TRANSPARENT_HUGEPAGE
is disabled.

Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Andrew Morton
Reviewed-by: William Kucharski
Reviewed-by: Zi Yan
Cc: David Hildenbrand
Cc: Mike Kravetz
Cc: "Kirill A. Shutemov"
Link: http://lkml.kernel.org/r/20200629151959.15779-5-willy@infradead.org
Signed-off-by: Linus Torvalds

Matthew Wilcox (Oracle)
2020-08-15 10:56:56 +0800

12 Aug, 2020

2 commits

57b077939 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost ... Browse Code »

Pull virtio updates from Michael Tsirkin:

- IRQ bypass support for vdpa and IFC

- MLX5 vdpa driver

- Endianness fixes for virtio drivers

- Misc other fixes

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (71 commits)
vdpa/mlx5: fix up endian-ness for mtu
vdpa: Fix pointer math bug in vdpasim_get_config()
vdpa/mlx5: Fix pointer math in mlx5_vdpa_get_config()
vdpa/mlx5: fix memory allocation failure checks
vdpa/mlx5: Fix uninitialised variable in core/mr.c
vdpa_sim: init iommu lock
virtio_config: fix up warnings on parisc
vdpa/mlx5: Add VDPA driver for supported mlx5 devices
vdpa/mlx5: Add shared memory registration code
vdpa/mlx5: Add support library for mlx5 VDPA implementation
vdpa/mlx5: Add hardware descriptive header file
vdpa: Modify get_vq_state() to return error code
net/vdpa: Use struct for set/get vq state
vdpa: remove hard coded virtq num
vdpasim: support batch updating
vhost-vdpa: support IOTLB batching hints
vhost-vdpa: support get/set backend features
vhost: generialize backend features setting/getting
vhost-vdpa: refine ioctl pre-processing
vDPA: dont change vq irq after DRIVER_OK
...

Linus Torvalds
2020-08-12 05:34:17 +0800
4bf5e3611 Merge tag 'libnvdimm-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm updayes from Vishal Verma:
"You'd normally receive this pull request from Dan Williams, but he's
busy watching a newborn (Congrats Dan!), so I'm watching libnvdimm
this cycle.

This adds a new feature in libnvdimm - 'Runtime Firmware Activation',
and a few small cleanups and fixes in libnvdimm and DAX. I'd
originally intended to make separate topic-based pull requests - one
for libnvdimm, and one for DAX, but some of the DAX material fell out
since it wasn't quite ready.

Summary:

- add 'Runtime Firmware Activation' support for NVDIMMs that
advertise the relevant capability

- misc libnvdimm and DAX cleanups"

* tag 'libnvdimm-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr
libnvdimm/security: the 'security' attr never show 'overwrite' state
libnvdimm/security: fix a typo
ACPI: NFIT: Fix ARS zero-sized allocation
dax: Fix incorrect argument passed to xas_set_err()
ACPI: NFIT: Add runtime firmware activate support
PM, libnvdimm: Add runtime firmware activation support
libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO()
drivers/dax: Expand lock scope to cover the use of addresses
fs/dax: Remove unused size parameter
dax: print error message by pr_info() in __generic_fsdax_supported()
driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}
tools/testing/nvdimm: Emulate firmware activation commands
tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation
tools/testing/nvdimm: Add command debug messages
tools/testing/nvdimm: Cleanup dimm index passing
ACPI: NFIT: Define runtime firmware activation commands
ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
libnvdimm: Validate command family indices

Linus Torvalds
2020-08-12 01:59:19 +0800

08 Aug, 2020

1 commit

25d8d4eec Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux ... Browse Code »

Pull powerpc updates from Michael Ellerman:

- Add support for (optionally) using queued spinlocks & rwlocks.

- Support for a new faster system call ABI using the scv instruction on
Power9 or later.

- Drop support for the PROT_SAO mmap/mprotect flag as it will be
unsupported on Power10 and future processors, leaving us with no way
to implement the functionality it requests. This risks breaking
userspace, though we believe it is unused in practice.

- A bug fix for, and then the removal of, our custom stack expansion
checking. We now allow stack expansion up to the rlimit, like other
architectures.

- Remove the remnants of our (previously disabled) topology update
code, which tried to react to NUMA layout changes on virtualised
systems, but was prone to crashes and other problems.

- Add PMU support for Power10 CPUs.

- A change to our signal trampoline so that we don't unbalance the link
stack (branch return predictor) in the signal delivery path.

- Lots of other cleanups, refactorings, smaller features and so on as
usual.

Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey
Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju
T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan
S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris
Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan
Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn
Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand,
Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel
Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh
Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton
Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran,
Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud,
Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar
Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza
Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov,
Wei Yongjun, Wen Xiong, YueHaibing.

* tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits)
selftests/powerpc: Fix pkey syscall redefinitions
powerpc: Fix circular dependency between percpu.h and mmu.h
powerpc/powernv/sriov: Fix use of uninitialised variable
selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
powerpc/40x: Fix assembler warning about r0
powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
cpuidle: pseries: Fixup exit latency for CEDE(0)
cpuidle: pseries: Add function to parse extended CEDE records
cpuidle: pseries: Set the latency-hint before entering CEDE
selftests/powerpc: Fix online CPU selection
powerpc/perf: Consolidate perf_callchain_user_[64|32]()
powerpc/pseries/hotplug-cpu: Remove double free in error path
powerpc/pseries/mobility: Add pr_debug() for device tree changes
powerpc/pseries/mobility: Set pr_fmt()
powerpc/cacheinfo: Warn if cache object chain becomes unordered
powerpc/cacheinfo: Improve diagnostics about malformed cache lists
powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
powerpc/cacheinfo: Set pr_fmt()
powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
...

Linus Torvalds
2020-08-08 01:33:50 +0800

05 Aug, 2020

1 commit

02e715b7f virtio_pmem: convert to LE accessors ... Browse Code »

Virtio pmem is modern-only. Use LE accessors for config space.

Signed-off-by: Michael S. Tsirkin

Michael S. Tsirkin
2020-08-05 23:08:41 +0800

04 Aug, 2020

4 commits

7f674025d libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr ... Browse Code »

commit 7d988097c546 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
adds a sysfs_notify_dirent() to wake up userspace poll thread when the "overwrite"
operation has completed. But the notification is issued before the internal
dimm security state and flags have been updated, so the userspace poll thread
wakes up and fetches the not-yet-updated attr and falls back to sleep, forever.
But if user from another terminal issue "ndctl wait-overwrite nmemX" again,
the command returns instantly.

Link: https://lore.kernel.org/r/1596494499-9852-3-git-send-email-jane.chu@oracle.com
Fixes: 7d988097c546 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
Cc: Dave Jiang
Cc: Dan Williams
Reviewed-by: Dave Jiang
Signed-off-by: Jane Chu
Signed-off-by: Vishal Verma

Jane Chu
2020-08-04 08:54:13 +0800
7c02d53df libnvdimm/security: the 'security' attr never show 'overwrite' state ... Browse Code »

'security' attribute displays the security state of an nvdimm.
During normal operation, the nvdimm state maybe one of 'disabled',
'unlocked' or 'locked'. When an admin issues
# ndctl sanitize-dimm nmem0 --overwrite
the attribute is expected to change to 'overwrite' until the overwrite
operation completes.

But tests on our systems show that 'overwrite' is never shown during
the overwrite operation. i.e.
# cat /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/nmem0/security
unlocked
the attribute remain 'unlocked' through out the operation, consequently
"ndctl wait-overwrite nmem0" command doesn't wait at all.

The driver tracks the state in 'nvdimm->sec.flags': when the operation
starts, it adds an overwrite bit to the flags; and when the operation
completes, it removes the bit. Hence security_show() should check the
'overwrite' bit first, in order to indicate the actual state when multiple
bits are set in the flags.

Link: https://lore.kernel.org/r/1596494499-9852-2-git-send-email-jane.chu@oracle.com
Reviewed-by: Dave Jiang
Signed-off-by: Jane Chu
Signed-off-by: Vishal Verma

Jane Chu
2020-08-04 08:54:13 +0800
dad42d175 libnvdimm/security: fix a typo ... Browse Code »

commit d78c620a2e82 ("libnvdimm/security: Introduce a 'frozen' attribute")
introduced a typo, causing a 'nvdimm->sec.flags' update being overwritten
by the subsequent update meant for 'nvdimm->sec.ext_flags'.

Link: https://lore.kernel.org/r/1596494499-9852-1-git-send-email-jane.chu@oracle.com
Fixes: d78c620a2e82 ("libnvdimm/security: Introduce a 'frozen' attribute")
Cc: Dan Williams
Reviewed-by: Dave Jiang
Signed-off-by: Jane Chu
Signed-off-by: Vishal Verma

Jane Chu
2020-08-04 08:54:13 +0800
382625d0d Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block ... Browse Code »

Pull core block updates from Jens Axboe:
"Good amount of cleanups and tech debt removals in here, and as a
result, the diffstat shows a nice net reduction in code.

- Softirq completion cleanups (Christoph)

- Stop using ->queuedata (Christoph)

- Cleanup bd claiming (Christoph)

- Use check_events, moving away from the legacy media change
(Christoph)

- Use inode i_blkbits consistently (Christoph)

- Remove old unused writeback congestion bits (Christoph)

- Cleanup/unify submission path (Christoph)

- Use bio_uninit consistently, instead of bio_disassociate_blkg
(Christoph)

- sbitmap cleared bits handling (John)

- Request merging blktrace event addition (Jan)

- sysfs add/remove race fixes (Luis)

- blk-mq tag fixes/optimizations (Ming)

- Duplicate words in comments (Randy)

- Flush deferral cleanup (Yufen)

- IO context locking/retry fixes (John)

- struct_size() usage (Gustavo)

- blk-iocost fixes (Chengming)

- blk-cgroup IO stats fixes (Boris)

- Various little fixes"

* tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
block: blk-timeout: delete duplicated word
block: blk-mq-sched: delete duplicated word
block: blk-mq: delete duplicated word
block: genhd: delete duplicated words
block: elevator: delete duplicated word and fix typos
block: bio: delete duplicated words
block: bfq-iosched: fix duplicated word
iocost_monitor: start from the oldest usage index
iocost: Fix check condition of iocg abs_vdebt
block: Remove callback typedefs for blk_mq_ops
block: Use non _rcu version of list functions for tag_set_list
blk-cgroup: show global disk stats in root cgroup io.stat
blk-cgroup: make iostat functions visible to stat printing
block: improve discard bio alignment in __blkdev_issue_discard()
block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
block: defer flush request no matter whether we have elevator
block: make blk_timeout_init() static
block: remove retry loop in ioc_release_fn()
block: remove unnecessary ioc nested locking
block: integrate bd_start_claiming into __blkdev_get
...

Linus Torvalds
2020-08-04 02:57:03 +0800

29 Jul, 2020

3 commits

a1facc1ff ACPI: NFIT: Add runtime firmware activate support ... Browse Code »

Plumb the platform specific backend for the generic libnvdimm firmware
activate interface. Register dimm level operations to arm/disarm
activation, and register bus level operations to report the dynamic
platform-quiesce time relative to the number of dimms armed for firmware
activation.

A new nfit-specific bus attribute "firmware_activate_noidle" is added to
allow the activation to switch between platform enforced, and OS
opportunistic device quiesce. In other words, let the hibernate cycle
handle in-flight device-dma rather than the platform attempting to
increase PCI-E timeouts and the like.

Cc: Dave Jiang
Cc: Ira Weiny
Cc: Vishal Verma
Signed-off-by: Dan Williams
Signed-off-by: Vishal Verma

Dan Williams
2020-07-29 09:29:22 +0800
48001ea50 PM, libnvdimm: Add runtime firmware activation support ... Browse Code »

Abstract platform specific mechanics for nvdimm firmware activation
behind a handful of generic ops. At the bus level ->activate_state()
indicates the unified state (idle, busy, armed) of all DIMMs on the bus,
and ->capability() indicates the system state expectations for activate.
At the DIMM level ->activate_state() indicates the per-DIMM state,
->activate_result() indicates the outcome of the last activation
attempt, and ->arm() attempts to transition the DIMM from 'idle' to
'armed'.

A new hibernate_quiet_exec() facility is added to support firmware
activation in an OS defined system quiesce state. It leverages the fact
that the hibernate-freeze state wants to assert that a memory
hibernation snapshot can be taken. This is in contrast to a platform
firmware defined quiesce state that may forcefully quiet the memory
controller independent of whether an individual device-driver properly
supports hibernate-freeze.

The libnvdimm sysfs interface is extended to support detection of a
firmware activate capability. The mechanism supports enumeration and
triggering of firmware activate, optionally in the
hibernate_quiet_exec() context.

[rafael: hibernate_quiet_exec() proposal]
[vishal: fix up sparse warning, grammar in Documentation/]

Cc: Pavel Machek
Cc: Ira Weiny
Cc: Len Brown
Cc: Jonathan Corbet
Cc: Dave Jiang
Cc: Vishal Verma
Reported-by: kernel test robot
Co-developed-by: "Rafael J. Wysocki"
Signed-off-by: "Rafael J. Wysocki"
Signed-off-by: Dan Williams
Signed-off-by: Vishal Verma

Dan Williams
2020-07-29 09:28:32 +0800
5cf81ce18 libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO() ... Browse Code »

Move libnvdimm sysfs attributes that currently use an open coded
DEVICE_ATTR() to hide sensitive root-only information (physical memory
layout) to the new DEVICE_ATTR_ADMIN_RO() helper.

Cc: Vishal Verma
Cc: Dave Jiang
Cc: Ira Weiny
Signed-off-by: Dan Williams
Signed-off-by: Vishal Verma

Dan Williams
2020-07-29 02:21:10 +0800

26 Jul, 2020

1 commit

92fe2aa85 libnvdimm: Validate command family indices ... Browse Code »

The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.

- Update the NFIT bus implementation (the only one that supports
ND_CMD_CALL passthrough) to also passlist the valid set of command
family indices.

- Update the generic __nd_ioctl() path to validate that field on behalf
of all implementations.

Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Ira Weiny
Cc: "Rafael J. Wysocki"
Cc: Len Brown
Cc:
Signed-off-by: Dan Williams
Signed-off-by: Vishal Verma

Dan Williams
2020-07-26 09:34:47 +0800

16 Jul, 2020

2 commits

8c26ab726 powerpc/pmem: Initialize pmem device on newer hardware ... Browse Code »

With kernel now supporting new pmem flush/sync instructions, we can now
enable the kernel to initialize the device. On P10 these devices would
appear with a new compatible string. For PAPR device we have

compatible "ibm,pmemory-v2"

and for OF pmem device we have

compatible "pmem-region-v2"

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200701072235.223558-8-aneesh.kumar@linux.ibm.com

Aneesh Kumar K.V
2020-07-16 11:00:23 +0800
3e79f082e libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier ... Browse Code »

Architectures like ppc64 provide persistent memory specific barriers
that will ensure that all stores for which the modifications are
written to persistent storage by preceding dcbfps and dcbstps
instructions have updated persistent storage before any data
access or data transfer caused by subsequent instructions is initiated.
This is in addition to the ordering done by wmb()

Update nvdimm core such that architecture can use barriers other than
wmb to ensure all previous writes are architecturally visible for
the platform buffer flush.

Signed-off-by: Aneesh Kumar K.V
Reviewed-by: Dan Williams
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200701072235.223558-5-aneesh.kumar@linux.ibm.com

Aneesh Kumar K.V
2020-07-16 11:00:22 +0800

09 Jul, 2020

1 commit

813357fea libnvdimm/security: Fix key lookup permissions ... Browse Code »

As of commit 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather
than a mask") lookup_user_key() needs an explicit declaration of what it
wants to do with the key. Add KEY_NEED_SEARCH to fix a warning with the
below signature, and fixes the inability to retrieve a key.

WARNING: CPU: 15 PID: 6276 at security/keys/permission.c:35 key_task_permission+0xd3/0x140
[..]
RIP: 0010:key_task_permission+0xd3/0x140
[..]
Call Trace:
lookup_user_key+0xeb/0x6b0
? vsscanf+0x3df/0x840
? key_validate+0x50/0x50
? key_default_cmp+0x20/0x20
nvdimm_get_user_key_payload.part.0+0x21/0x110 [libnvdimm]
nvdimm_security_store+0x67d/0xb20 [libnvdimm]
security_store+0x67/0x1a0 [libnvdimm]
kernfs_fop_write+0xcf/0x1c0
vfs_write+0xde/0x1d0
ksys_write+0x68/0xe0
do_syscall_64+0x5c/0xa0
entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fixes: 8c0637e950d6 ("keys: Make the KEY_NEED_* perms an enum rather than a mask")
Suggested-by: David Howells
Reviewed-by: Dave Jiang
Reviewed-by: Ira Weiny
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Ira Weiny
Link: https://lore.kernel.org/r/159297332630.1304143.237026690015653759.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams

Dan Williams
2020-07-09 08:08:01 +0800

01 Jul, 2020

1 commit

c62b37d96 block: move ->make_request_fn to struct block_device_operations ... Browse Code »

The make_request_fn is a little weird in that it sits directly in
struct request_queue instead of an operation vector. Replace it with
a block_device_operations method called submit_bio (which describes much
better what it does). Also remove the request_queue argument to it, as
the queue can be derived pretty trivially from the bio.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-07-01 21:27:24 +0800

18 Jun, 2020

1 commit

543094e19 nvdimm/region: always show the 'align' attribute ... Browse Code »

It is possible that a platform that is capable of 'namespace labels'
comes up without the labels properly initialized. In this case, the
region's 'align' attribute is hidden. Howerver, once the user does
initialize he labels, the 'align' attribute still stays hidden, which is
unexpected.

The sysfs_update_group() API is meant to address this, and could be
called during region probe, but it has entanglements with the device
'lockdep_mutex'. Therefore, simply make the 'align' attribute always
visible. It doesn't matter what it says for label-less namespaces, since
it is not possible to change their allocation anyway.

Suggested-by: Dan Williams
Signed-off-by: Vishal Verma
Cc: Dan Williams
Link: https://lore.kernel.org/r/20200520225026.29426-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams

Vishal Verma
2020-06-18 05:08:31 +0800

14 Jun, 2020

1 commit

d74b15dbb Merge tag 'libnvdimm-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm updates from Dan Williams:
"Small collection of cleanups to rework usage of ->queuedata and the
GUID api"

* tag 'libnvdimm-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm/pmem: stop using ->queuedata
nvdimm/btt: stop using ->queuedata
nvdimm/blk: stop using ->queuedata
libnvdimm: Replace guid_copy() with import_guid() where it makes sense

Linus Torvalds
2020-06-14 04:04:36 +0800

09 Jun, 2020

1 commit

e0cf615d7 asm-generic: don't include <linux/mm.h> in cacheflush.h ... Browse Code »

This seems to lead to some crazy include loops when using
asm-generic/cacheflush.h on more architectures, so leave it to the arch
header for now.

[hch@lst.de: fix warning]
Link: http://lkml.kernel.org/r/20200520173520.GA11199@lst.de

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Cc: Will Deacon
Cc: Nick Piggin
Cc: Peter Zijlstra
Cc: Jeff Dike
Cc: Richard Weinberger
Cc: Anton Ivanov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Dan Williams
Cc: Vishal Verma
Cc: Dave Jiang
Cc: Keith Busch
Cc: Ira Weiny
Cc: Arnd Bergmann
Link: http://lkml.kernel.org/r/20200515143646.3857579-7-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-09 02:05:57 +0800

27 May, 2020

1 commit

0fd92f89a nvdimm: use bio_{start,end}_io_acct ... Browse Code »

Switch dm to use the nicer bio accounting helpers.

Signed-off-by: Christoph Hellwig
Reviewed-by: Konstantin Khlebnikov
Signed-off-by: Jens Axboe

Christoph Hellwig
2020-05-27 19:21:23 +0800

14 May, 2020

3 commits

6ec26b8b2 nvdimm/pmem: stop using ->queuedata ... Browse Code »

In preparation for removing queuedata as an argument to
make_request_fn() drop the dependency ->queuedata.

Signed-off-by: Christoph Hellwig
Link: https://lore.kernel.org/r/20200508161517.252308-16-hch@lst.de
Signed-off-by: Dan Williams

Christoph Hellwig
2020-05-14 06:15:37 +0800
5713bcc3f nvdimm/btt: stop using ->queuedata ... Browse Code »

In preparation for removing queuedata as an argument to
make_request_fn() drop the dependency ->queuedata.

Signed-off-by: Christoph Hellwig
Link: https://lore.kernel.org/r/20200508161517.252308-15-hch@lst.de
Signed-off-by: Dan Williams

Christoph Hellwig
2020-05-14 06:15:37 +0800
daa28975d nvdimm/blk: stop using ->queuedata ... Browse Code »

In preparation for removing queuedata as an argument to
make_request_fn() drop the dependency ->queuedata.

Signed-off-by: Christoph Hellwig
Link: https://lore.kernel.org/r/20200508161517.252308-14-hch@lst.de
Signed-off-by: Dan Williams

Christoph Hellwig
2020-05-14 06:14:22 +0800

09 Apr, 2020

1 commit

9b06860d7 Merge tag 'libnvdimm-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm and dax updates from Dan Williams:
"There were multiple touches outside of drivers/nvdimm/ this round to
add cross arch compatibility to the devm_memremap_pages() interface,
enhance numa information for persistent memory ranges, and add a
zero_page_range() dax operation.

This cycle I switched from the patchwork api to Konstantin's b4 script
for collecting tags (from x86, PowerPC, filesystem, and device-mapper
folks), and everything looks to have gone ok there. This has all
appeared in -next with no reported issues.

Summary:

- Add support for region alignment configuration and enforcement to
fix compatibility across architectures and PowerPC page size
configurations.

- Introduce 'zero_page_range' as a dax operation. This facilitates
filesystem-dax operation without a block-device.

- Introduce phys_to_target_node() to facilitate drivers that want to
know resulting numa node if a given reserved address range was
onlined.

- Advertise a persistence-domain for of_pmem and papr_scm. The
persistence domain indicates where cpu-store cycles need to reach
in the platform-memory subsystem before the platform will consider
them power-fail protected.

- Promote numa_map_to_online_node() to a cross-kernel generic
facility.

- Save x86 numa information to allow for node-id lookups for reserved
memory ranges, deploy that capability for the e820-pmem driver.

- Pick up some miscellaneous minor fixes, that missed v5.6-final,
including a some smatch reports in the ioctl path and some unit
test compilation fixups.

- Fixup some flexible-array declarations"

* tag 'libnvdimm-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (29 commits)
dax: Move mandatory ->zero_page_range() check in alloc_dax()
dax,iomap: Add helper dax_iomap_zero() to zero a range
dax: Use new dax zero page method for zeroing a page
dm,dax: Add dax zero_page_range operation
s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
dax, pmem: Add a dax operation zero_page_range
pmem: Add functions for reading/writing page to/from pmem
libnvdimm: Update persistence domain value for of_pmem and papr_scm device
tools/test/nvdimm: Fix out of tree build
libnvdimm/region: Fix build error
libnvdimm/region: Replace zero-length array with flexible-array member
libnvdimm/label: Replace zero-length array with flexible-array member
ACPI: NFIT: Replace zero-length array with flexible-array member
libnvdimm/region: Introduce an 'align' attribute
libnvdimm/region: Introduce NDD_LABELING
libnvdimm/namespace: Enforce memremap_compat_align()
libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid
libnvdimm: Out of bounds read in __nd_ioctl()
acpi/nfit: improve bounds checking for 'func'
mm/memremap_pages: Introduce memremap_compat_align()
...

Linus Torvalds
2020-04-09 12:03:40 +0800

03 Apr, 2020

5 commits

f6d2b802f Merge branch 'for-5.7/libnvdimm' into libnvdimm-for-next ... Browse Code »

- Introduce 'zero_page_range' as a dax operation. This facilitates
filesystem-dax operation without a block-device.

- Advertise a persistence-domain for of_pmem and papr_scm. The
persistence domain indicates where cpu-store cycles need to reach in
the platform-memory subsystem before the platform will consider them
power-fail protected.

- Fixup some flexible-array declarations.

Dan Williams
2020-04-03 10:55:17 +0800
d3b88655c Merge branch 'for-5.7/numa' into libnvdimm-for-next ... Browse Code »

- Promote numa_map_to_online_node() to a cross-kernel generic facility.

- Save x86 numa information to allow for node-id lookups for reserved
memory ranges, deploy that capability for the e820-pmem driver.

- Introduce phys_to_target_node() to facilitate drivers that want to
know resulting numa node if a given reserved address range was
onlined.

Dan Williams
2020-04-03 10:50:31 +0800
91bf79bcb Merge branch 'for-5.6/libnvdimm-fixes' into libnvdimm-for-next ... Browse Code »

Pick up some miscellaneous minor fixes, that missed v5.6-final,
including a some smatch reports in the ioctl path and some unit test
compilation fixups.

Dan Williams
2020-04-03 10:47:12 +0800
4e4ced937 dax: Move mandatory ->zero_page_range() check in alloc_dax() ... Browse Code »

zero_page_range() dax operation is mandatory for dax devices. Right now
that check happens in dax_zero_page_range() function. Dan thinks that's
too late and its better to do the check earlier in alloc_dax().

I also modified alloc_dax() to return pointer with error code in it in
case of failure. Right now it returns NULL and caller assumes failure
happened due to -ENOMEM. But with this ->zero_page_range() check, I
need to return -EINVAL instead.

Signed-off-by: Vivek Goyal
Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com
Signed-off-by: Dan Williams

Vivek Goyal
2020-04-03 10:15:03 +0800
f605a263e dax, pmem: Add a dax operation zero_page_range ... Browse Code »

Add a dax operation zero_page_range, to zero a page. This will also clear any
known poison in the page being zeroed.

As of now, zeroing of one page is allowed in a single call. There
are no callers which are trying to zero more than a page in a single call.
Once we grow the callers which zero more than a page in single call, we
can add that support. Primary reason for not doing that yet is that this
will add little complexity in dm implementation where a range might be
spanning multiple underlying targets and one will have to split the range
into multiple sub ranges and call zero_page_range() on individual targets.

Suggested-by: Christoph Hellwig
Signed-off-by: Vivek Goyal
Reviewed-by: Pankaj Gupta
Link: https://lore.kernel.org/r/20200228163456.1587-3-vgoyal@redhat.com
Signed-off-by: Dan Williams

Vivek Goyal
2020-04-03 10:15:03 +0800