28 Feb, 2018

2 commits

  • commit 41fce90f26333c4fa82e8e43b9ace86c4e8a0120 upstream.

    The following namespace configuration attempt:

    # ndctl create-namespace -e namespace0.0 -m devdax -a 1G -f
    libndctl: ndctl_dax_enable: dax0.1: failed to enable
    Error: namespace0.0: failed to enable

    failed to reconfigure namespace: No such device or address

    ...fails when the backing memory range is not physically aligned to 1G:

    # cat /proc/iomem | grep Persistent
    210000000-30fffffff : Persistent Memory (legacy)

    In the above example the 4G persistent memory range starts and ends on a
    256MB boundary.

    We already handle this correctly for ranges that violate section
    alignment (128MB) because they collide with "System RAM"; we simply
    need to extend that padding/truncation to the 1GB alignment use case.

    Cc:
    Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute...")
    Reported-and-tested-by: Jane Chu
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
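
As a rough illustration of the padding/truncation idea described above, the following sketch computes how much of the example range from /proc/iomem would have to be skipped at the start and dropped at the end to obtain a 1G-aligned window. The helper names are hypothetical and this is not the kernel implementation, only the arithmetic.

```python
GIB = 1 << 30


def align_down(value, align):
    """Round value down to a multiple of align (align must be a power of two)."""
    return value & ~(align - 1)


def align_up(value, align):
    """Round value up to a multiple of align (align must be a power of two)."""
    return align_down(value + align - 1, align)


def pad_for_alignment(start, end, align):
    """Return (start_pad, end_trunc, usable_bytes) for an inclusive [start, end] range."""
    aligned_start = align_up(start, align)
    aligned_end = align_down(end + 1, align)  # exclusive end of the aligned window
    start_pad = aligned_start - start
    end_trunc = (end + 1) - aligned_end
    usable = max(aligned_end - aligned_start, 0)
    return start_pad, end_trunc, usable


# Range taken from the /proc/iomem output quoted in the commit message.
start_pad, end_trunc, usable = pad_for_alignment(0x210000000, 0x30FFFFFFF, GIB)
print(hex(start_pad), hex(end_trunc), usable // GIB)
```

For this range the sketch skips 768MB at the start and 256MB at the end, leaving 3G of 1G-aligned capacity out of the original 4G.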
     
  • commit 58738c495e15badd2015e19ff41f1f1ed55200bc upstream.

    Dan reports:
    The patch 62232e45f4a2: "libnvdimm: control (ioctl) messages for
    nvdimm_bus and nvdimm devices" from Jun 8, 2015, leads to the
    following static checker warning:

    drivers/nvdimm/bus.c:1018 __nd_ioctl()
    warn: integer overflows 'buf_len'

    From a casual review, this seems like it might be a real bug. On
    the first iteration we load some data into in_env[]. On the second
    iteration we read a user-controlled "in_size" from nd_cmd_in_size().
    It can go up to UINT_MAX - 1. A high number means we will fill the
    whole in_env[] buffer. But we potentially keep looping and adding
    more to in_len so now it can be any value.

    It's simple enough to change, but it feels weird that we keep looping
    even though in_env is totally full. Shouldn't we just return an
    error if we don't have space for desc->in_num?

    We keep looping because the size of the total input is allowed to be
    bigger than the 'envelope' which is a subset of the payload that tells
    us how much data to expect. For safety explicitly check that buf_len
    does not overflow which is what the checker flagged.

    Cc:
    Fixes: 62232e45f4a2: "libnvdimm: control (ioctl) messages for nvdimm_bus..."
    Reported-by: Dan Carpenter
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
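
The overflow the checker flagged can be pictured with a small model. The sketch below assumes 32-bit accumulation for the unsafe case and uses a hypothetical 4MB cap (`MAX_BUFLEN`); neither the names nor the cap value are taken from the commit message.

```python
U32_MASK = 0xFFFFFFFF
MAX_BUFLEN = 4 * 1024 * 1024  # hypothetical upper bound on the total buffer


def wrapped_u32_sum(field_sizes):
    """Model of the unsafe case: a 32-bit accumulator silently wraps around."""
    total = 0
    for size in field_sizes:
        total = (total + size) & U32_MASK
    return total


def checked_buf_len(field_sizes, cap=MAX_BUFLEN):
    """Accumulate in a wide integer and reject totals above the cap."""
    total = 0
    for size in field_sizes:
        if not 0 <= size <= U32_MASK:
            raise ValueError("field size does not fit in 32 bits")
        total += size
        if total > cap:
            raise OverflowError("total buffer length exceeds the cap")
    return total


# A small envelope field followed by a near-UINT_MAX user-controlled size.
sizes = [16, 0xFFFFFFFE]
print(wrapped_u32_sum(sizes))  # wraps to a small, misleading value
```

The loop can keep walking the fields past the envelope, as the commit message explains; the point of the check is only that the running total is never allowed to wrap.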
     

24 Jan, 2018

1 commit

  • commit 24e3a7fb60a9187e5df90e5fa655ffc94b9c4f77 upstream.

    Due to a spec misinterpretation, the Linux implementation of the BTT log
    area had a different padding scheme from other implementations, such as
    UEFI and NVML.

    This fixes the padding scheme, and defaults to it for new BTT layouts.
    We attempt to detect the padding scheme in use when probing for an
    existing BTT. If we detect the older/incompatible scheme, we continue
    using it.

    Reported-by: Juston Li
    Cc: Dan Williams
    Cc:
    Fixes: 5212e11fde4d ("nd_btt: atomic sector updates")
    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Vishal Verma
     

30 Dec, 2017

1 commit

  • commit 19deaa217bc04e83b59b5e8c8229eb0e53ad9efc upstream.

    The alignment checks at pfn driver startup fail to properly account for
    the 'start_pad' in the case where the namespace is misaligned relative
    to its internal alignment. This is typically triggered in 1G aligned
    namespaces, but could theoretically trigger with small namespace
    alignments. When this triggers the kernel reports messages of the form:

    dax2.1: bad offset: 0x3c000000 dax disabled align: 0x40000000

    Fixes: 1ee6667cd8d1 ("libnvdimm, pfn, dax: fix initialization vs autodetect...")
    Reported-by: Jane Chu
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     

30 Nov, 2017

3 commits

  • commit c1fb3542074fd0c4d901d778bd52455111e4eb6f upstream.

    For the same reason that /proc/iomem returns 0's for non-root readers
    and acpi tables are root-only, make the 'resource' attribute for
    namespace devices only readable by root. Otherwise we disclose physical
    address information.

    Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
    Reported-by: Dave Hansen
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     
  • commit b18d4b8a25af6fe83d7692191d6ff962ea611c4f upstream.

    The set of valid sequence numbers is {1,2,3}. The specification
    indicates that an implementation should consider 0 a sign of a critical
    error:

    UEFI 2.7: 13.19 NVDIMM Label Protocol

    Software never writes the sequence number 00, so a correctly
    check-summed Index Block with this sequence number probably indicates a
    critical error. When software discovers this case it treats it as an
    invalid Index Block indication.

    While the expectation is that the invalid block is just thrown away, the
    Robustness Principle says we should fix this to make both sequence
    numbers valid.

    Fixes: f524bf271a5c ("libnvdimm: write pmem label set")
    Reported-by: Juston Li
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
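
To make the sequence-number rules concrete, here is a minimal sketch of a cyclic 1 -> 2 -> 3 -> 1 scheme in which 0 is never treated as valid. The function names are hypothetical, and the code illustrates the rule quoted from the specification rather than the change made by this commit.

```python
VALID_SEQS = (1, 2, 3)


def is_valid_seq(seq):
    """Sequence number 0 is never written by software, so it is invalid."""
    return seq in VALID_SEQS


def next_seq(seq):
    """Advance cyclically: 1 -> 2 -> 3 -> 1."""
    if not is_valid_seq(seq):
        raise ValueError("cannot advance an invalid sequence number")
    return seq % 3 + 1


def newer_seq(a, b):
    """Pick the more recent of two sequence numbers, or None if both are invalid."""
    if not is_valid_seq(a):
        return b if is_valid_seq(b) else None
    if not is_valid_seq(b):
        return a
    # b is newer only when it is the direct cyclic successor of a.
    return b if next_seq(a) == b else a


print(newer_seq(3, 1), newer_seq(0, 2), newer_seq(0, 0))
```

With two index blocks both carrying 0, this model has nothing valid to choose from, which is the situation the commit message says should be repaired rather than left in place.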
     
  • commit 26417ae4fc6108f8db436f24108b08f68bdc520e upstream.

    For the same reason that /proc/iomem returns 0's for non-root readers
    and acpi tables are root-only, make the 'resource' attribute for pfn
    devices only readable by root. Otherwise we disclose physical address
    information.

    Fixes: f6ed58c70d14 ("libnvdimm, pfn: 'resource'-address and 'size'...")
    Reported-by: Dave Hansen
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
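
The effect of making a 'resource' attribute root-only can be expressed in terms of file mode bits. The sketch below assumes the attribute moves from a world-readable mode (0444) to an owner-only mode (0400); the exact modes are an assumption for illustration, since the commit messages only state that the attribute becomes readable by root only.

```python
import stat

WORLD_READABLE = 0o444  # assumed mode before the change
ROOT_ONLY = 0o400       # assumed mode after the change (file owned by root)


def readable_by_non_owner(mode):
    """True if group or other users have read permission."""
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


print(readable_by_non_owner(WORLD_READABLE), readable_by_non_owner(ROOT_ONLY))
```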
     

28 Jul, 2017

2 commits

  • commit 4e3f0701f25ab194c5362576b1146a1e6cc6c2e7 upstream.

    __add_badblock_range() does not account for sector alignment when
    it sets 'num_sectors'. Therefore, an ARS error record range
    spanning across two sectors is set to a single sector length,
    which leaves the 2nd sector unprotected.

    Change __add_badblock_range() to set 'num_sectors' properly.

    Fixes: 0caeef63e6d2 ("libnvdimm: Add a poison list and export badblocks")
    Signed-off-by: Toshi Kani
    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
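
The sector-alignment problem is easy to reproduce with plain arithmetic. The sketch below assumes 512-byte sectors and uses hypothetical helper names; the naive helper is only a stand-in for any calculation that ignores where the range starts within its first sector.

```python
SECTOR_SIZE = 512


def naive_num_sectors(offset, length):
    """Ignores the offset, so an unaligned range is under-counted."""
    return max(length // SECTOR_SIZE, 1)


def badblock_span(offset, length):
    """Return (start_sector, num_sectors) covering bytes [offset, offset + length)."""
    if length <= 0:
        raise ValueError("length must be positive")
    start_sector = offset // SECTOR_SIZE
    # Round the exclusive end up so a partially covered sector is included.
    end_sector = (offset + length + SECTOR_SIZE - 1) // SECTOR_SIZE
    return start_sector, end_sector - start_sector


# A 512-byte error record that starts in the middle of sector 1.
print(naive_num_sectors(768, 512), badblock_span(768, 512))
```

The record covers bytes 768 through 1279, which touches sectors 1 and 2, so a length-only calculation marks one sector while the range actually needs two.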
     
  • commit c13c43d54f2c6a3be1c675766778ac1ad8dfbfcc upstream.

    btt_rw_page was not propagating errors from btt_do_bvec, resulting in any
    IO errors via the rw_page path going unnoticed. The pmem driver recently
    fixed this in e10624f ("pmem: fail io-requests to known bad blocks"),
    but the same problem in BTT went neglected.

    Fixes: 5212e11fde4d ("nd_btt: atomic sector updates")
    Cc: Toshi Kani
    Cc: Dan Williams
    Cc: Jeff Moyer
    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Vishal Verma
     

05 Jul, 2017

1 commit

  • [ Upstream commit d47d1d27fd6206c18806440f6ebddf51a806be4f ]

    The read_pmem() function uses memcpy_mcsafe() on x86 where an EFAULT
    error code indicates a failed read. Block I/O should use EIO to
    indicate failure. Other pmem code paths (like bad blocks) already use
    EIO so let's be consistent.

    This fixes compatibility with consumers like btrfs that try to parse the
    specific error code rather than treat all errors the same.

    Reviewed-by: Jeff Moyer
    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Dan Williams
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stefan Hajnoczi
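
The error-code translation described above amounts to a small mapping at the block-I/O boundary. This sketch uses a hypothetical `copy_fn` callback that returns 0 or a negative errno value; it models the idea only and does not reproduce the driver code.

```python
import errno


def read_pmem(copy_fn):
    """Run the copy and report any failure to the block layer as -EIO."""
    rc = copy_fn()
    if rc != 0:
        # Whatever the low-level copy reported (for example -EFAULT),
        # block I/O consumers expect -EIO for a failed read.
        return -errno.EIO
    return 0


print(read_pmem(lambda: -errno.EFAULT) == -errno.EIO)
```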
     

25 May, 2017

1 commit

  • commit 8d13c0290655b883df9083a2a0af0d782bc38aef upstream.

    ND_CMD_CLEAR_ERROR command returns 'clear_err.cleared', the length
    of error actually cleared, which may be smaller than its requested
    'len'.

    Change nvdimm_clear_poison() to call nvdimm_forget_poison() with
    'clear_err.cleared' when this value is valid.

    Fixes: e046114af5fc ("libnvdimm: clear the internal poison_list when clearing badblocks")
    Cc: Dave Jiang
    Cc: Vishal Verma
    Signed-off-by: Toshi Kani
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
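
A toy model shows why the cleared length matters. The sketch below tracks poison as a set of 512-byte sector numbers and forgets only the sectors fully covered by the length that was actually cleared; the data structure and names are assumptions made for illustration.

```python
SECTOR_SIZE = 512


def forget_poison(bad_sectors, start, cleared):
    """Drop only the sectors fully covered by bytes [start, start + cleared)."""
    if cleared <= 0:
        return set(bad_sectors)  # nothing was cleared, keep everything
    first = (start + SECTOR_SIZE - 1) // SECTOR_SIZE
    last = (start + cleared) // SECTOR_SIZE  # exclusive
    return {s for s in bad_sectors if not first <= s < last}


bad = set(range(8))        # sectors 0..7 are poisoned
requested_len = 4096       # caller asked to clear all eight sectors
cleared_len = 2048         # firmware reports only half was cleared
print(sorted(forget_poison(bad, 0, cleared_len)))
```

Forgetting `requested_len` instead of `cleared_len` here would drop sectors 4 through 7 from the list even though they are still poisoned.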
     

20 May, 2017

3 commits

  • commit d5483feda85a8f39ee2e940e279547c686aac30c upstream.

    Fix failures to create namespaces due to the vmem_altmap not advertising
    enough free space to store the memmap.

    WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
    [..]
    Call Trace:
    dump_stack+0x63/0x83
    __warn+0xcb/0xf0
    warn_slowpath_null+0x1d/0x20
    arch_add_memory+0xde/0xf0
    devm_memremap_pages+0x244/0x440
    pmem_attach_disk+0x37e/0x490 [nd_pmem]
    nd_pmem_probe+0x7e/0xa0 [nd_pmem]
    nvdimm_bus_probe+0x71/0x120 [libnvdimm]
    driver_probe_device+0x2bb/0x460
    bind_store+0x114/0x160
    drv_attr_store+0x25/0x30

    In commit 658922e57b84 "libnvdimm, pfn: fix memmap reservation sizing"
    we arranged for the capacity to be allocated, but failed to also update
    the 'npfns' parameter. This leads to cases where there is enough
    capacity reserved to hold all the allocated sections, but
    vmemmap_populate_hugepages() still encounters -ENOMEM from
    altmap_alloc_block_buf().

    This fix is a stop-gap until we can teach the core memory hotplug
    implementation to permit sub-section hotplug.

    Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing")
    Reported-by: Anisha Allada
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     
  • commit b2518c78ce76896f0f8f7940bf02104b227e1709 upstream.

    The following BUG was observed when nd_pmem_notify() was called
    for a BTT device. The use of a pmem_device pointer is not valid
    with BTT.

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
    IP: nd_pmem_notify+0x30/0xf0 [nd_pmem]
    Call Trace:
    nd_device_notify+0x40/0x50
    child_notify+0x10/0x20
    device_for_each_child+0x50/0x90
    nd_region_notify+0x20/0x30
    nd_device_notify+0x40/0x50
    nvdimm_region_notify+0x27/0x30
    acpi_nfit_scrub+0x341/0x590 [nfit]
    process_one_work+0x197/0x450
    worker_thread+0x4e/0x4a0
    kthread+0x109/0x140

    Fix nd_pmem_notify() by setting nd_region and badblocks pointers
    properly for BTT.

    Cc: Vishal Verma
    Fixes: 719994660c24 ("libnvdimm: async notification support")
    Signed-off-by: Toshi Kani
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
     
  • commit bc042fdfbb92b5b13421316b4548e2d6e98eed37 upstream.

    In the case where a dimm does not have any associated flush hints the
    ndrd->flush_wpq array may be uninitialized leading to crashes with the
    following signature:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: region_visible+0x10f/0x160 [libnvdimm]

    Call Trace:
    internal_create_group+0xbe/0x2f0
    sysfs_create_groups+0x40/0x80
    device_add+0x2d8/0x650
    nd_async_device_register+0x12/0x40 [libnvdimm]
    async_run_entry_fn+0x39/0x170
    process_one_work+0x212/0x6c0
    ? process_one_work+0x197/0x6c0
    worker_thread+0x4e/0x4a0
    kthread+0x10c/0x140
    ? process_one_work+0x6c0/0x6c0
    ? kthread_create_on_node+0x60/0x60
    ret_from_fork+0x31/0x40

    Reviewed-by: Jeff Moyer
    Fixes: f284a4f23752 ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()")
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     

21 Apr, 2017

2 commits

  • commit 0beb2012a1722633515c8aaa263c73449636c893 upstream.

    Holding the reconfig_mutex over a potential userspace fault sets up a
    lockdep dependency chain between filesystem-DAX and the libnvdimm ioctl
    path. Move the user access outside of the lock.

    [ INFO: possible circular locking dependency detected ]
    4.11.0-rc3+ #13 Tainted: G W O
    -------------------------------------------------------
    fallocate/16656 is trying to acquire lock:
    (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [] nvdimm_bus_lock+0x21/0x30 [libnvdimm]
    but task is already holding lock:
    (jbd2_handle){++++..}, at: [] start_this_handle+0x104/0x460

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #2 (jbd2_handle){++++..}:
    lock_acquire+0xbd/0x200
    start_this_handle+0x16a/0x460
    jbd2__journal_start+0xe9/0x2d0
    __ext4_journal_start_sb+0x89/0x1c0
    ext4_dirty_inode+0x32/0x70
    __mark_inode_dirty+0x235/0x670
    generic_update_time+0x87/0xd0
    touch_atime+0xa9/0xd0
    ext4_file_mmap+0x90/0xb0
    mmap_region+0x370/0x5b0
    do_mmap+0x415/0x4f0
    vm_mmap_pgoff+0xd7/0x120
    SyS_mmap_pgoff+0x1c5/0x290
    SyS_mmap+0x22/0x30
    entry_SYSCALL_64_fastpath+0x1f/0xc2

    -> #1 (&mm->mmap_sem){++++++}:
    lock_acquire+0xbd/0x200
    __might_fault+0x70/0xa0
    __nd_ioctl+0x683/0x720 [libnvdimm]
    nvdimm_ioctl+0x8b/0xe0 [libnvdimm]
    do_vfs_ioctl+0xa8/0x740
    SyS_ioctl+0x79/0x90
    do_syscall_64+0x6c/0x200
    return_from_SYSCALL_64+0x0/0x7a

    -> #0 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
    __lock_acquire+0x16b6/0x1730
    lock_acquire+0xbd/0x200
    __mutex_lock+0x88/0x9b0
    mutex_lock_nested+0x1b/0x20
    nvdimm_bus_lock+0x21/0x30 [libnvdimm]
    nvdimm_forget_poison+0x25/0x50 [libnvdimm]
    nvdimm_clear_poison+0x106/0x140 [libnvdimm]
    pmem_do_bvec+0x1c2/0x2b0 [nd_pmem]
    pmem_make_request+0xf9/0x270 [nd_pmem]
    generic_make_request+0x118/0x3b0
    submit_bio+0x75/0x150

    Fixes: 62232e45f4a2 ("libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices")
    Cc: Dave Jiang
    Reported-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     
  • commit fe514739d8538783749d3ce72f78e5a999ea5668 upstream.

    Commit a1f3e4d6a0c3 "libnvdimm, region: update nd_region_available_dpa()
    for multi-pmem support" reworked blk dpa (DIMM Physical Address)
    accounting to comprehend multiple pmem namespace allocations aliasing
    with a given blk-dpa range.

    The following call trace is a result of failing to account for allocated
    blk capacity.

    WARNING: CPU: 1 PID: 2433 at tools/testing/nvdimm/../../../drivers/nvdimm/names
    4 size_store+0x6f3/0x930 [libnvdimm]
    nd_region region5: allocation underrun: 0x0 of 0x1000000 bytes
    [..]
    Call Trace:
    dump_stack+0x86/0xc3
    __warn+0xcb/0xf0
    warn_slowpath_fmt+0x5f/0x80
    size_store+0x6f3/0x930 [libnvdimm]
    dev_attr_store+0x18/0x30

    If a given blk-dpa allocation does not alias with any pmem ranges then
    the full allocation should be accounted as busy space, not the size of
    the current pmem contribution to the region.

    The thinko that led to this confusion was not realizing that the struct
    resource management is already guaranteeing no collisions between pmem
    allocations and blk allocations on the same dimm. Also, we do not try to
    support blk allocations in aliased pmem holes.

    This patch also fixes a case where the available blk goes negative.

    Fixes: a1f3e4d6a0c3 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support").
    Reported-by: Dariusz Dokupil
    Reported-by: Dave Jiang
    Reported-by: Vishal Verma
    Tested-by: Dave Jiang
    Tested-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     

15 Mar, 2017

1 commit

  • commit 86ef58a4e35e8fa66afb5898cf6dec6a3bb29f67 upstream.

    The interleave-set cookie is a sum that sanity checks that the composition
    of an interleave set has not changed from when the namespace was initially
    created. The checksum is calculated by sorting the DIMMs by their
    location in the interleave-set. The comparison for the sort must be
    64-bit wide, not byte-by-byte as performed by memcmp() in the broken
    case.

    Fix the implementation to accept correct cookie values in addition to
    the Linux "memcmp" order cookies, but only allow correct cookies to be
    generated going forward. It does mean that namespaces created by
    third-party-tooling, or created by newer kernels with this fix, will not
    validate on older kernels. However, there are a couple mitigating
    conditions:

    1/ platforms with namespace-label capable NVDIMMs are not widely
    available.

    2/ interleave-sets with a single-dimm are by definition not affected
    (nothing to sort). This covers the QEMU-KVM NVDIMM emulation case.

    The cookie stored in the namespace label will be fixed by any write to the
    namespace label. The most straightforward way to achieve this is to
    write to the "alt_name" attribute of a namespace in sysfs.

    Fixes: eaf961536e16 ("libnvdimm, nfit: add interleave-set state-tracking infrastructure")
    Reported-by: Nicholas Moulin
    Tested-by: Nicholas Moulin
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
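
The difference between a 64-bit comparison and a byte-by-byte comparison can be demonstrated directly. The sketch below assumes the sort keys are stored little-endian, as on x86, and uses made-up key values; any order-sensitive checksum computed over the two orderings would then differ.

```python
import struct


def memcmp_order(values):
    """Sort 64-bit keys by their little-endian byte representation (memcmp-style)."""
    return sorted(values, key=lambda v: struct.pack("<Q", v))


def u64_order(values):
    """Sort 64-bit keys by numeric value."""
    return sorted(values)


keys = [0x0100, 0x0002]  # hypothetical sort keys for two DIMMs
print(memcmp_order(keys), u64_order(keys))
```

Byte-wise, 0x0100 sorts first because its lowest-addressed byte is 0x00, while numerically 0x0002 is the smaller key. With a single DIMM there is only one key, so both orderings agree, matching mitigation 2/ above.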
     

15 Feb, 2017

2 commits

  • commit bfb34527a32a1a576d9bfb7026d3ab0369a6cd60 upstream.

    When vmemmap_populate() allocates space for the memmap it does so in 2MB
    sized chunks. The libnvdimm-pfn driver incorrectly accounts for this
    when the alignment of the device is set to 4K. When this happens we
    trigger memory allocation failures in altmap_alloc_block_buf() and
    trigger warnings of the form:

    WARNING: CPU: 0 PID: 3376 at arch/x86/mm/init_64.c:656 arch_add_memory+0xe4/0xf0
    [..]
    Call Trace:
    dump_stack+0x86/0xc3
    __warn+0xcb/0xf0
    warn_slowpath_null+0x1d/0x20
    arch_add_memory+0xe4/0xf0
    devm_memremap_pages+0x29b/0x4e0

    Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE")
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
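
The accounting issue comes down to rounding the memmap reservation up to the allocation granularity. The sketch below assumes 4K pages and a 64-byte per-page descriptor (typical for x86_64, but an assumption here), with 2MB chunks as stated in the commit message; the function name is hypothetical.

```python
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64   # assumed per-page metadata size
CHUNK_SIZE = 2 << 20    # memmap is populated in 2MB chunks


def align_up(value, align):
    """Round value up to a multiple of align (align must be a power of two)."""
    return (value + align - 1) & ~(align - 1)


def memmap_reservation(namespace_bytes):
    """Return (raw_bytes, reserved_bytes) needed to hold the memmap."""
    npfns = namespace_bytes // PAGE_SIZE
    raw = npfns * STRUCT_PAGE_SIZE
    return raw, align_up(raw, CHUNK_SIZE)


raw, reserved = memmap_reservation(100 << 20)  # a 100MB namespace
print(raw, reserved)
```

Reserving only the raw size leaves the last 2MB chunk short, which is the kind of shortfall that would surface as an allocation failure when the memmap is populated.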
     
  • commit 9d032f4201d39e5cf43a8709a047e481f5723fdc upstream.

    Given that the naming of pmem devices changes from the pmemX form to the
    pmemX.Y form when namespace id is greater than 0, arrange for namespaces
    with id-0 to be exempt from deletion. Otherwise a simple reconfiguration
    of an existing namespace to a new mode results in a name change of the
    resulting block device:

    # ndctl list --namespace=namespace1.0
    {
    "dev":"namespace1.0",
    "mode":"raw",
    "size":2147483648,
    "uuid":"3dadf3dc-89b9-4b24-b20e-abc8a4707ce3",
    "blockdev":"pmem1"
    }

    # ndctl create-namespace --reconfig=namespace1.0 --mode=memory --force
    {
    "dev":"namespace1.1",
    "mode":"memory",
    "size":2111832064,
    "uuid":"7b4a6341-7318-4219-a02c-fb57c0bbf613",
    "blockdev":"pmem1.1"
    }

    This change does require tooling changes to explicitly look for
    namespaceX.0 if the seed has already advanced to another namespace.

    Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
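
The naming rule behind the example output above can be written as a one-line function. The helper name is hypothetical; the rule itself (pmemX for namespace id 0, pmemX.Y otherwise) is taken from the commit message.

```python
def pmem_blockdev_name(region_id, namespace_id):
    """Return the block device name for a pmem namespace."""
    if namespace_id == 0:
        return f"pmem{region_id}"
    return f"pmem{region_id}.{namespace_id}"


print(pmem_blockdev_name(1, 0), pmem_blockdev_name(1, 1))
```

Keeping namespace1.0 alive across a reconfiguration means the first name keeps being produced, instead of switching to the second as in the ndctl output above.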
     

26 Jan, 2017

1 commit

  • commit 1f19b983a8877f81763fab3e693c6befe212736d upstream.

    Commit 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple
    pmem-namespaces per region") added support for establishing additional
    pmem namespaces beyond the seed device, similar to blk namespaces.
    However, it neglected to delete the namespace when the size is set to
    zero.

    Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     

09 Jan, 2017

1 commit

  • commit af7d9f0c57941b465043681cb5c3410f7f3f1a41 upstream.

    Fix the format specifier so that the attribute can be parsed correctly.
    Currently it returns decimal 1000 for a 4096-byte alignment.

    Reported-by: Dave Jiang
    Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE")
    Signed-off-by: Dan Williams
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
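
The parsing problem is visible with ordinary string formatting. The sketch below assumes the attribute was printed as bare hexadecimal without a "0x" prefix, which is consistent with the "1000" reading mentioned above; the function names are hypothetical.

```python
def show_align_broken(align):
    """Bare hex with no prefix: indistinguishable from a decimal number."""
    return "%x" % align


def show_align_decimal(align):
    """Plain decimal output parses back to the original value."""
    return "%d" % align


def parse_attr(text):
    """Parse like a typical tool: base inferred from the prefix, decimal by default."""
    return int(text, 0)


print(parse_attr(show_align_broken(4096)), parse_attr(show_align_decimal(4096)))
```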
     

07 Dec, 2016

1 commit

  • Given ambiguities in the ACPI 6.1 definition of the "Output (Size)"
    field of the ARS (Address Range Scrub) Status command, a firmware
    implementation may in practice return 0, 4, or 8 to indicate that there
    is no output payload to process.

    The specification states "Size of Output Buffer in bytes, including this
    field.". However, 'Output Buffer' is also the name of the entire
    payload, and earlier in the specification it states "Max Query ARS
    Status Output Buffer Size: Maximum size of buffer (including the Status
    and Extended Status fields)".

    Without this fix if the BIOS happens to return 0 it causes memory
    corruption as evidenced by this result from the acpi_nfit_ctl() unit
    test.

    ars_status00000000: 00020000 00000000 ........
    BUG: stack guard page was hit at ffffc90001750000 (stack is ffffc9000174c000..ffffc9000174ffff)
    kernel stack overflow (page fault): 0000 [#1] SMP DEBUG_PAGEALLOC
    task: ffff8803332d2ec0 task.stack: ffffc9000174c000
    RIP: 0010:[] [] __memcpy+0x12/0x20
    RSP: 0018:ffffc9000174f9a8 EFLAGS: 00010246
    RAX: ffffc9000174fab8 RBX: 0000000000000000 RCX: 000000001fffff56
    RDX: 0000000000000000 RSI: ffff8803231f5a08 RDI: ffffc90001750000
    RBP: ffffc9000174fa88 R08: ffffc9000174fab0 R09: ffff8803231f54b8
    R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000003 R15: ffff8803231f54a0
    FS: 00007f3a611af640(0000) GS:ffff88033ed00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90001750000 CR3: 0000000325b20000 CR4: 00000000000406e0
    Stack:
    ffffffffa00bc60d 0000000000000008 ffffc90000000001 ffffc9000174faac
    0000000000000292 ffffffffa00c24e4 ffffffffa00c2914 0000000000000000
    0000000000000000 ffffffff00000003 ffff880331ae8ad0 0000000800000246
    Call Trace:
    [] ? acpi_nfit_ctl+0x49d/0x750 [nfit]
    [] nfit_test_probe+0x670/0xb1b [nfit_test]

    Cc:
    Fixes: 747ffe11b440 ("libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing")
    Signed-off-by: Dan Williams

    Dan Williams
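
The memory corruption follows from unsigned arithmetic on a size that is smaller than expected. The sketch below is a simplified model, assuming a 32-bit size field and a 4-byte subtraction for the field itself; the guard that treats 0, 4, and 8 as "no payload" mirrors the commit message, while everything else is an illustrative assumption.

```python
U32_MASK = 0xFFFFFFFF
SIZE_FIELD_BYTES = 4  # assumed width of the "Output (Size)" field


def naive_payload_len(out_length):
    """Unsigned 32-bit subtraction: a reported size of 0 wraps to a huge length."""
    return (out_length - SIZE_FIELD_BYTES) & U32_MASK


def guarded_payload_len(out_length):
    """Treat 0, 4, and 8 as 'no payload' before doing any subtraction."""
    if out_length <= 8:
        return 0
    return out_length - SIZE_FIELD_BYTES


print(hex(naive_payload_len(0)), guarded_payload_len(0))
```

A copy sized by the wrapped value would run far past any real buffer, which is consistent with the stack overflow reported by the unit test above.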
     

28 Oct, 2016

1 commit

  • A bugfix just tried to address a randconfig build problem and introduced
    a variant of the same problem: with CONFIG_LIBNVDIMM=y and
    CONFIG_NVDIMM_DAX=m, the nvdimm module now fails to link:

    drivers/nvdimm/built-in.o: In function `to_nd_device_type':
    bus.c:(.text+0x1b5d): undefined reference to `is_nd_dax'
    drivers/nvdimm/built-in.o: In function `nd_region_notify_driver_action.constprop.2':
    region_devs.c:(.text+0x6b6c): undefined reference to `is_nd_dax'
    region_devs.c:(.text+0x6b8c): undefined reference to `to_nd_dax'
    drivers/nvdimm/built-in.o: In function `nd_region_probe':
    region.c:(.text+0x70f3): undefined reference to `nd_dax_create'
    drivers/nvdimm/built-in.o: In function `mode_show':
    namespace_devs.c:(.text+0xa196): undefined reference to `is_nd_dax'
    drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
    (.text+0xa55f): undefined reference to `is_nd_dax'
    drivers/nvdimm/built-in.o: In function `nvdimm_namespace_common_probe':
    (.text+0xa56e): undefined reference to `to_nd_dax'

    This reverts the earlier fix, making NVDIMM_DAX a 'bool' option again
    as it should be (it gets linked into the libnvdimm module). To fix
    the original problem, I'm adding a dependency on LIBNVDIMM to
    DEV_DAX_PMEM, which ensures we can't have that one built-in if the
    rest is a module.

    Fixes: 4e65e9381c7a ("/dev/dax: fix Kconfig dependency build breakage")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Arnd Bergmann
     

20 Oct, 2016

2 commits

  • ACPI Clear Uncorrectable Error DSM function may fail or may be
    unsupported on a platform. pmem_clear_poison() returns without clearing
    badblocks in such cases. This failure is detected at the next read
    (-EIO).

    This behavior can lead to an issue when a user keeps writing but does not
    read immediately. For instance, a flight recorder file may only be read
    when it is necessary for troubleshooting.

    Change pmem_do_bvec() and pmem_clear_poison() to return -EIO so that
    the filesystem can log an error message on a write error.

    Cc: Vishal Verma
    Signed-off-by: Toshi Kani
    Signed-off-by: Dan Williams

    Toshi Kani
     
  • If the kcalloc() fails then "devs" can be NULL and we dereference it
    checking "devs[i]".

    Fixes: 1b40e09a1232 ('libnvdimm: blk labels and namespace instantiation')
    Signed-off-by: Dan Carpenter
    Signed-off-by: Dan Williams

    Dan Carpenter
     

08 Oct, 2016

12 commits

  • Dan Williams
     
  • Dan Williams
     
  • The function dax_pmem_probe() in drivers/dax/pmem.c is compiled under the
    CONFIG_DEV_DAX_PMEM tri-state config option. This config option currently
    only depends on CONFIG_NVDIMM_DAX, a bool, which means that the following
    configuration is possible:

    CONFIG_LIBNVDIMM=m
    ...
    CONFIG_NVDIMM_DAX=y
    CONFIG_DEV_DAX=y
    CONFIG_DEV_DAX_PMEM=y

    With this config LIBNVDIMM is compiled as a module with NVDIMM_DAX=y just
    meaning that we will compile drivers/nvdimm/dax_devs.c into that module.
    However, dax_pmem_probe() depends on several symbols defined in
    drivers/nvdimm/dax_devs.c, which results in the following build errors:

    drivers/built-in.o: In function `dax_pmem_probe':
    linux/drivers/dax/pmem.c:70: undefined reference to `to_nd_dax'
    linux/drivers/dax/pmem.c:74: undefined reference to
    `nvdimm_namespace_common_probe'
    linux/drivers/dax/pmem.c:80: undefined reference to `devm_nsio_enable'
    linux/drivers/dax/pmem.c:81: undefined reference to `nvdimm_setup_pfn'
    linux/drivers/dax/pmem.c:84: undefined reference to `devm_nsio_disable'
    linux/drivers/dax/pmem.c:122: undefined reference to `to_nd_region'
    drivers/built-in.o: In function `dax_pmem_init':
    linux/drivers/dax/pmem.c:147: undefined reference to `__nd_driver_register'

    Fix this by making NVDIMM_DAX a tristate. DEV_DAX_PMEM depends on
    NVDIMM_DAX which depends on LIBNVDIMM. Since they are all now tristates,
    if LIBNVDIMM is built as a kernel module DEV_DAX_PMEM will be as well.
    This prevents dax_devs.c from being built as a built-in while its
    dependencies are in the libnvdimm.ko module.

    Signed-off-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Ross Zwisler
     
  • Similar to BLK regions, publish new seed namespace devices to allow
    unused PMEM region capacity to be consumed by additional namespaces.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Now that the rest of the infrastructure has been converted to handle
    multi-pmem configurations, lift the artificial barrier at scan time.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Short-circuit doomed-to-fail label validation attempts by skipping
    labels that are outside the given region. For example a DIMM that has
    multiple PMEM regions will waste time attempting to create namespaces
    only to find that the interleave-set-cookie does not validate, e.g.:

    nd_region region6: invalid cookie in label: 73e608dc-47b9-4b2a-b5c7-2d55a32e0c2

    Similar to how we skip BLK labels when performing PMEM validation we can
    skip out-of-range labels early.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Now that we have nd_region_available_dpa() able to handle the presence
    of multiple PMEM allocations in aliased PMEM regions, reuse that same
    infrastructure to track allocations from free space. In particular
    handle allocating from an aliased PMEM region in the case where there
    are dis-contiguous holes. The allocation for BLK and PMEM are
    documented in the space_valid() helper:

    BLK-space is valid as long as it does not precede a PMEM
    allocation in a given region. PMEM-space must be contiguous
    and adjacent to an existing allocation (if one
    exists).

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Instead of assuming that there will only ever be one allocated range at
    the start of the region, account for additional namespaces that might
    start at an offset from the region base.

    After this change pmem namespaces now have a reason to carry an array of
    resources similar to blk. Unifying the resource tracking infrastructure
    in nd_namespace_common is a future cleanup candidate.

    Signed-off-by: Dan Williams

    Dan Williams
     
    pmem devices are currently named /dev/pmemX. Preserve the
    naming of the 0th device, but add a ".Y" suffix for other
    devices.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • The free dpa (dimm-physical-address) space calculation reports how much
    free space is available with consideration for aliased BLK + PMEM
    regions. Recall that BLK capacity is allocated from high addresses and
    PMEM is allocated from low addresses in their respective regions.

    nd_region_available_dpa() accounts for the fact that the largest
    encroachment (lowest starting address) into PMEM capacity by a BLK
    allocation limits the available capacity to that point, regardless of
    whether there is a BLK allocation hole at a higher address. Similarly, for the
    multi-pmem case we need to track the largest encroachment (highest
    ending address) of a PMEM allocation in BLK capacity regardless of
    whether there is an allocation hole that a BLK allocation could fill at
    a lower address.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • Add more determinism to initial namespace device-name assignments by
    sorting the namespaces by starting dpa.

    Signed-off-by: Dan Williams

    Dan Williams
     
  • If label scanning finds multiple valid pmem namespaces allow them to be
    surfaced rather than fail namespace scanning. Support for creating
    multiple namespaces per region is saved for a later patch.

    Note that this adds some new error messages to clarify which of the pmem
    namespaces in the set are potentially impacted by invalid labels.

    Signed-off-by: Dan Williams

    Dan Williams
     

06 Oct, 2016

2 commits


01 Oct, 2016

1 commit