22 Oct, 2015

4 commits

  • The libnvdimm-btt and nvme drivers use blk_integrity to reserve space
    for per-sector metadata, but sometimes without protection checksums.
    This property is generically useful, so teach the block core to
    internally specify a nop profile if one is not provided at registration
    time.

    Cc: Keith Busch
    Cc: Matthew Wilcox
    Suggested-by: Christoph Hellwig
    [hch: kill the local nvme nop profile as well]
    Acked-by: Martin K. Petersen
    Signed-off-by: Dan Williams
    Signed-off-by: Jens Axboe

    Dan Williams
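
The fallback described in the commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct integrity_profile`, `struct integrity`, and `integrity_register()` are hypothetical stand-ins, not the kernel's actual definitions. The point is that a caller who only wants per-sector metadata space passes no profile and still ends up with valid, do-nothing callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's integrity types (illustrative only). */
struct integrity_profile {
    const char *name;
    int (*generate_fn)(void *data, size_t len);
    int (*verify_fn)(const void *data, size_t len);
};

struct integrity {
    const struct integrity_profile *profile;
    unsigned int tuple_size; /* bytes of per-sector metadata */
};

/* The nop callbacks compute and check nothing; they always report success. */
static int nop_generate(void *data, size_t len)
{
    (void)data;
    (void)len;
    return 0;
}

static int nop_verify(const void *data, size_t len)
{
    (void)data;
    (void)len;
    return 0;
}

static const struct integrity_profile nop_profile = {
    .name = "nop",
    .generate_fn = nop_generate,
    .verify_fn = nop_verify,
};

/*
 * Reserve metadata space for a device. If the caller supplies no profile,
 * substitute the built-in nop profile so later code never sees NULL.
 */
void integrity_register(struct integrity *bi,
                        const struct integrity_profile *profile,
                        unsigned int tuple_size)
{
    assert(bi != NULL);
    bi->profile = profile ? profile : &nop_profile;
    bi->tuple_size = tuple_size;
}
```

With this shape, a driver that needs 8 bytes of metadata per sector but no checksums would call `integrity_register(&bi, NULL, 8)` and never has to define its own empty profile.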
     
  • Now that the integrity profile is statically allocated there is no work
    to do when shutting down an integrity enabled block device.

    Cc: Matthew Wilcox
    Cc: Mike Snitzer
    Cc: James Bottomley
    Acked-by: NeilBrown
    Acked-by: Keith Busch
    Acked-by: Vishal Verma
    Tested-by: Ross Zwisler
    Signed-off-by: Dan Williams
    Signed-off-by: Jens Axboe

    Dan Williams
     
  • Up until now the integrity profile has been dynamically allocated and
    attached to struct gendisk after the disk has been made active.

    This causes problems because NVMe devices need to register the profile
    prior to the partition table being read due to a mandatory metadata
    buffer requirement. In addition, DM jumps through hoops to deal with
    preallocating, but not initializing, integrity profiles.

    Since the integrity profile is small (4 bytes + a pointer), Christoph
    suggested moving it to struct gendisk proper. This requires several
    changes:

    - Moving the blk_integrity definition to genhd.h.

    - Inlining blk_integrity in struct gendisk.

    - Removing the dynamic allocation code.

    - Adding helper functions which allow gendisk to set up and tear down
    the integrity sysfs dir when a disk is added/deleted.

    - Adding a blk_integrity_revalidate() callback for updating the stable
    pages bdi setting.

    - The calls that depend on whether a device has an integrity profile or
    not now key off of the bi->profile pointer.

    - Simplifying the integrity support routines in DM (Mike Snitzer).

    Signed-off-by: Martin K. Petersen
    Reported-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Mike Snitzer
    Cc: Dan Williams
    Signed-off-by: Dan Williams
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • We previously made a complete copy of a device's data integrity profile
    even though several of the fields inside the blk_integrity struct are
    pointers to fixed template entries in t10-pi.c.

    Split the static and per-device portions so that we can reference the
    template directly.

    Signed-off-by: Martin K. Petersen
    Reported-by: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Cc: Dan Williams
    Signed-off-by: Dan Williams
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

17 Sep, 2015

3 commits


09 Sep, 2015

1 commit

  • Pull libnvdimm updates from Dan Williams:
    "This update has successfully completed a 0day-kbuild run and has
    appeared in a linux-next release. The changes outside of the typical
    drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
    removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
    the introduction of ZONE_DEVICE + devm_memremap_pages().

    Summary:

    - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.

    This facility is used by the pmem driver to enable pfn_to_page()
    operations on the page frames returned by DAX ('direct_access' in
    'struct block_device_operations').

    For now, the 'memmap' allocation for these "device" pages comes
    from "System RAM". Support for allocating the memmap from device
    memory will arrive in a later kernel.

    - Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt(). memremap() drops the __iomem annotation for these
    mappings to memory that do not have i/o side effects. The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.

    Completion of the conversion is targeted for v4.4.

    - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.

    - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.

    - Miscellaneous updates and fixes to libnvdimm including support for
    issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes"

    * tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
    libnvdimm, pmem: direct map legacy pmem by default
    libnvdimm, pmem: 'struct page' for pmem
    libnvdimm, pfn: 'struct page' provider infrastructure
    x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
    add devm_memremap_pages
    mm: ZONE_DEVICE for "device memory"
    mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
    dax: drop size parameter to ->direct_access()
    nd_blk: change aperture mapping from WC to WB
    nvdimm: change to use generic kvfree()
    pmem, dax: have direct_access use __pmem annotation
    dax: update I/O path to do proper PMEM flushing
    pmem: add copy_from_iter_pmem() and clear_pmem()
    pmem, x86: clean up conditional pmem includes
    pmem: remove layer when calling arch_has_wmb_pmem()
    pmem, x86: move x86 PMEM API to new pmem.h header
    libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
    pmem: switch to devm_ allocations
    devres: add devm_memremap
    libnvdimm, btt: write and validate parent_uuid
    ...

    Linus Torvalds
     

03 Sep, 2015

1 commit

  • Pull core block updates from Jens Axboe:
    "This first core part of the block IO changes contains:

    - Cleanup of the bio IO error signaling from Christoph. We used to
    rely on the uptodate bit and passing around of an error, now we
    store the error in the bio itself.

    - Improvement of the above from myself, by shrinking the bio size
    down again to fit in two cachelines on x86-64.

    - Revert of the max_hw_sectors cap removal from a revision again,
    from Jeff Moyer. This caused performance regressions in various
    tests. Reinstate the limit, bump it to a more reasonable size
    instead.

    - Make /sys/block//queue/discard_max_bytes writeable, by me.
    Most devices have huge trim limits, which can cause nasty latencies
    when deleting files. Enable the admin to configure the size down.
    We will look into having a more sane default instead of UINT_MAX
    sectors.

    - Improvement of the SG gaps logic from Keith Busch.

    - Enable the block core to handle arbitrarily sized bios, which
    enables a nice simplification of bio_add_page() (which is an IO hot
    path). From Kent.

    - Improvements to the partition io stats accounting, making it
    faster. From Ming Lei.

    - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
    file in blk-mq, as well as a fix for a blk-mq timeout race
    condition.

    - Ming Lin has been carrying Kent's above-mentioned patches forward
    for a while, and testing them. Ming also did a few fixes around
    that.

    - Sasha Levin found and fixed a use-after-free problem introduced by
    the bio->bi_error changes from Christoph.

    - Small blk cgroup cleanup from Viresh Kumar"

    * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
    blk: Fix bio_io_vec index when checking bvec gaps
    block: Replace SG_GAPS with new queue limits mask
    block: bump BLK_DEF_MAX_SECTORS to 2560
    Revert "block: remove artifical max_hw_sectors cap"
    blk-mq: fix race between timeout and freeing request
    blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    Documentation: update notes in biovecs about arbitrarily sized bios
    block: remove bio_get_nr_vecs()
    fs: use helper bio_add_page() instead of open coding on bi_io_vec
    block: kill merge_bvec_fn() completely
    md/raid5: get rid of bio_fits_rdev()
    md/raid5: split bio for chunk_aligned_read
    block: remove split code in blkdev_issue_{discard,write_same}
    btrfs: remove bio splitting and merge_bvec_fn() calls
    bcache: remove driver private bio splitting code
    block: simplify bio_add_page()
    block: make generic_make_request handle arbitrarily sized bios
    blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
    block: don't access bio->bi_error after bio_put()
    block: shrink struct bio down to 2 cache lines again
    ...

    Linus Torvalds
     

29 Aug, 2015

3 commits

  • The expectation is that the legacy / non-standard pmem discovery method
    (e820 type-12) will only ever be used to describe small quantities of
    persistent memory. Larger capacities will be described via the ACPI
    NFIT. When "allocate struct page from pmem" support is added this default
    policy can be overridden by assigning a legacy pmem namespace to a pfn
    device, however this would only be necessary if a platform used the
    legacy mechanism to define a very large range.

    Cc: Christoph Hellwig
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Enable the pmem driver to handle PFN device instances. Attaching a pmem
    namespace to a pfn device triggers the driver to allocate and initialize
    struct page entries for pmem. Memory capacity for this allocation comes
    exclusively from RAM for now which is suitable for low PMEM to RAM
    ratios. This mechanism will be expanded later for setting an "allocate
    from PMEM" policy.

    Cc: Boaz Harrosh
    Cc: Ross Zwisler
    Cc: Christoph Hellwig
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Implement the base infrastructure for libnvdimm PFN devices. Similar to
    BTT devices they take a namespace as a backing device and layer
    functionality on top. In this case the functionality is reserving space
    for an array of 'struct page' entries to be handed out through
    pfn_to_page(). For now this is just the basic libnvdimm-device-model for
    configuring the base PFN device.

    As the namespace claiming mechanism for PFN devices is mostly identical
    to BTT devices drivers/nvdimm/claim.c is created to house the common
    bits.

    Cc: Ross Zwisler
    Signed-off-by: Dan Williams

    Dan Williams
     

28 Aug, 2015

4 commits

  • Given that a write-back (WB) mapping plus non-temporal stores is
    expected to be the most efficient way to access PMEM, update the
    definition of ARCH_HAS_PMEM_API to imply arch support for
    WB-mapped-PMEM. This is needed as a pre-requisite for adding PMEM to
    the direct map and mapping it with struct page.

    The above clarification for X86_64 means that memcpy_to_pmem() is
    permitted to use the non-temporal arch_memcpy_to_pmem() rather than
    needlessly fall back to default_memcpy_to_pmem() when the pcommit
    instruction is not available. When arch_memcpy_to_pmem() is not
    guaranteed to flush writes out of cache, i.e. on older X86_32
    implementations where non-temporal stores may just dirty cache,
    ARCH_HAS_PMEM_API is simply disabled.

    The default fall back for persistent memory handling remains. Namely,
    map it with the WT (write-through) cache-type and hope for the best.

    arch_has_pmem_api() is updated to only indicate whether the arch
    provides the proper helpers to meet the minimum "writes are visible
    outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()". Code
    that cares whether wmb_pmem() actually flushes writes to pmem must now
    call arch_has_wmb_pmem() directly.

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Reviewed-by: Ross Zwisler
    [hch: set ARCH_HAS_PMEM_API=n on x86_32]
    Reviewed-by: Christoph Hellwig
    [toshi: x86_32 compile fixes]
    Signed-off-by: Toshi Kani
    Signed-off-by: Dan Williams

    Dan Williams
     
  • None of the implementations currently use the size parameter. The common
    bdev_direct_access() entry point handles all the size checks before
    calling ->direct_access().

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Dan Williams
     
  • Signed-off-by: yalin wang
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

    yalin wang
     

21 Aug, 2015

1 commit

  • Update the annotation for the kaddr pointer returned by direct_access()
    so that it is a __pmem pointer. This is consistent with the PMEM driver
    and with how this direct_access() pointer is used in the DAX code.

    Signed-off-by: Ross Zwisler
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dan Williams

    Ross Zwisler
     

19 Aug, 2015

1 commit

  • We currently register a platform device for e820 type-12 memory and
    register a nvdimm bus beneath it. Registering the platform device
    triggers the device-core machinery to probe for a driver, but that
    search currently comes up empty. Building the nvdimm-bus registration
    into the e820_pmem platform device registration in this way forces
    libnvdimm to be built-in. Instead, convert the built-in portion of
    CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
    rest of the logic to the driver for e820_pmem, for the following
    reasons:

    1/ Letting e820_pmem support be a module allows building and testing
    libnvdimm.ko changes without rebooting

    2/ All the normal policy around modules can be applied to e820_pmem
    (unbind to disable and/or blacklisting the module from loading by
    default)

    3/ Moving the driver to a generic location and converting it to scan
    "iomem_resource" rather than "e820.map" means any other architecture can
    take advantage of this simple nvdimm resource discovery mechanism by
    registering a resource named "Persistent Memory (legacy)"

    Cc: Christoph Hellwig
    Signed-off-by: Dan Williams

    Dan Williams
     

15 Aug, 2015

4 commits

  • Signed-off-by: Christoph Hellwig
    [djbw: tools/testing/nvdimm/ and memunmap_pmem support]
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Christoph Hellwig
     
  • When a BTT is instantiated on a namespace it must validate the namespace
    uuid matches the 'parent_uuid' stored in the btt superblock. This
    property enforces that changing the namespace UUID invalidates all
    former BTT instances on that storage. For "IO namespaces" that don't
    have a label or UUID, the parent_uuid is set to zero, and this
    validation is skipped. For such cases, old BTTs have to be invalidated
    by forcing the namespace to raw mode, and overwriting the BTT info
    blocks.

    Based on a patch by Dan Williams

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
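
The parent_uuid rule from the commit above can be sketched as a small userspace C check. This is a hedged illustration, not the driver's code: `btt_parent_uuid_ok()` is a hypothetical name, and representing a label-less "IO namespace" by a NULL namespace UUID is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/*
 * Decide whether a BTT superblock may be used on a namespace.
 *
 * ns_uuid:     UUID of the backing namespace, or NULL for a namespace
 *              that has no label or UUID (assumption of this sketch).
 * parent_uuid: the 'parent_uuid' recorded in the BTT superblock.
 */
bool btt_parent_uuid_ok(const uint8_t *ns_uuid,
                        const uint8_t parent_uuid[UUID_LEN])
{
    assert(parent_uuid != NULL);

    /* No namespace UUID to compare against: validation is skipped. */
    if (ns_uuid == NULL)
        return true;

    /* Otherwise the superblock must name exactly this namespace. */
    return memcmp(ns_uuid, parent_uuid, UUID_LEN) == 0;
}
```

Changing the namespace UUID makes the comparison fail for every older superblock, which is the invalidation property the commit describes. The skipped case is why old BTTs on label-less namespaces have to be overwritten by hand.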
     
  • Use arena_is_valid as a common routine for checking the validity of an
    info block from both discover_arenas and nd_btt_probe.

    As a result, don't check for validity of the BTT's UUID and lbasize.
    The checksum in the BTT info block guarantees self-consistency, and when
    we're called from nd_btt_probe, we don't have a valid uuid or lbasize
    available to check against.

    Also clean up to return a bool instead of an int.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
     
  • Consolidate the parameters passed to arena_is_valid into just nd_btt
    and an info block to increase reusability.

    Similarly, btt_arena_write_layout doesn't need to be passed a uuid, as
    it can be obtained from arena->nd_btt.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
     

01 Aug, 2015

1 commit

  • Fix multiple build warnings when CONFIG_BTT is not enabled:

    In file included from ../drivers/nvdimm/bus.c:29:0:
    ../drivers/nvdimm/nd.h:169:15: warning: return type defaults to 'int' [-Wreturn-type]
    static inline nd_btt_probe(struct nd_namespace_common *ndns, void *drvdata)
    ^

    Signed-off-by: Randy Dunlap
    Cc: Dan Williams
    Cc: linux-nvdimm@lists.01.org
    Signed-off-by: Dan Williams

    Randy Dunlap
     

29 Jul, 2015

1 commit

  • Currently we have two different ways to signal an I/O error on a BIO:

    (1) by clearing the BIO_UPTODATE flag
    (2) by returning a Linux errno value to the bi_end_io callback

    The first one has the drawback of only communicating a single possible
    error (-EIO), and the second one has the drawback of not being persistent
    when bios are queued up, and of not being passed along from child to parent
    bio in the ever more popular chaining scenario. Having both mechanisms
    available has the additional drawback of utterly confusing driver authors
    and introducing bugs where various I/O submitters only deal with one of
    them, and the others have to add boilerplate code to deal with both kinds
    of error returns.

    So add a new bi_error field to store an errno value directly in struct
    bio and remove the existing mechanisms to clean all this up.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: NeilBrown
    Signed-off-by: Jens Axboe

    Christoph Hellwig
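
A minimal userspace sketch of the idea in the commit above: the error lives in the bio itself, so it survives queueing and can be handed from a chained child to its parent. `struct mock_bio` and `mock_bio_endio()` are hypothetical stand-ins, not the kernel's `struct bio` or `bio_endio()`, and the "first error wins" rule is a choice made for this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct bio. */
struct mock_bio {
    int bi_error;                            /* 0 on success, negative errno on failure */
    struct mock_bio *parent;                 /* chained parent, or NULL */
    void (*bi_end_io)(struct mock_bio *bio); /* completion callback, may be NULL */
};

/*
 * Complete a bio. The callback takes no separate error argument; it reads
 * bio->bi_error. A failed child records its error in the parent unless the
 * parent already holds an earlier one.
 */
void mock_bio_endio(struct mock_bio *bio)
{
    assert(bio != NULL);

    if (bio->parent && bio->bi_error && !bio->parent->bi_error)
        bio->parent->bi_error = bio->bi_error;

    if (bio->bi_end_io)
        bio->bi_end_io(bio);
}
```

Because any errno fits in `bi_error`, a child failing with `-ENOSPC` is reported to the parent as `-ENOSPC`, which a single "not up to date" flag could only have expressed as `-EIO`.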
     

28 Jul, 2015

2 commits

  • Based on the patch c8fa317 ("brd: Request from fdisk 4k alignment") by Boaz
    Harrosh, allow fdisk to create properly aligned partitions for DAX. This
    will also cause mkfs.ext4 to emit a warning if using a file system block
    size of less than PAGE_SIZE.

    Cc: Dan Williams
    Cc: Ross Zwisler
    Cc: Matthew Wilcox
    Cc: Christoph Hellwig
    Cc: Elliott, Robert
    Signed-off-by: Vishal Verma
    Acked-by: Boaz Harrosh
    Acked-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Vishal Verma
     
  • Fix:
    drivers/nvdimm/btt.c:635:29: warning: restricted __le64 degrades to integer

    Signed-off-by: Dan Williams

    Dan Williams
     

26 Jul, 2015

1 commit

  • A new BLK namespace "seed" device is created whenever the current seed
    is successfully probed. However, if that namespace is assigned to a BTT
    it may never directly experience a successful probe as it is a
    subordinate device to a BTT configuration.

    The effect of the current code is that no new namespaces can be
    instantiated, after the seed namespace, to consume available BLK DPA
    capacity. Fix this by treating a successful BTT probe event as a
    successful probe event for the backing namespace.

    Reported-by: Nicholas Moulin
    Signed-off-by: Dan Williams

    Dan Williams
     

01 Jul, 2015

2 commits


26 Jun, 2015

11 commits

  • Based on an original patch by Ross Zwisler [1].

    Writes to persistent memory have the potential to be posted to cpu
    cache, cpu write buffers, and platform write buffers (memory controller)
    before being committed to persistent media. Provide apis,
    memcpy_to_pmem(), wmb_pmem(), and memremap_pmem(), to write data to
    pmem and assert that it is durable in PMEM (a persistent linear address
    range). A '__pmem' attribute is added so sparse can track proper usage
    of pointers to pmem.

    This continues the status quo of pmem being x86 only for 4.2, but
    reworks to ioremap, and wider implementation of memremap() will enable
    other archs in 4.3.

    [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-May/000932.html

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Ross Zwisler
    [djbw: various reworks]
    Signed-off-by: Dan Williams

    Ross Zwisler
     
  • Add support for sysfs 'numa_node' to I/O-related NVDIMM devices
    under /sys/bus/nd/devices, regionN, namespaceN.0, and bttN.x.

    An example of numa_node values on a 2-socket system with a single
    NVDIMM range on each socket is shown below.
    /sys/bus/nd/devices
    |-- btt0.0/numa_node:0
    |-- btt1.0/numa_node:1
    |-- btt1.1/numa_node:1
    |-- namespace0.0/numa_node:0
    |-- namespace1.0/numa_node:1
    |-- region0/numa_node:0
    |-- region1/numa_node:1

    These numa_node files are then linked under the block class of
    their device names.
    /sys/class/block/pmem0/device/numa_node:0
    /sys/class/block/pmem1s/device/numa_node:1

    This enables numactl(8) to accept 'block:' and 'file:' paths of
    pmem and btt devices as shown in the examples below.
    numactl --preferred block:pmem0 --show
    numactl --preferred file:/dev/pmem1s --show

    Signed-off-by: Toshi Kani
    Signed-off-by: Dan Williams

    Toshi Kani
     
  • ACPI NFIT table has System Physical Address Range Structure entries that
    describe a proximity ID of each range when ACPI_NFIT_PROXIMITY_VALID is
    set in the flags.

    Change acpi_nfit_register_region() to map a proximity ID to its node ID,
    and set it to a new numa_node field of nd_region_desc, which is then
    conveyed to the nd_region device.

    The device core arranges for btt and namespace devices to inherit their
    node from their parent region.

    Signed-off-by: Toshi Kani
    [djbw: move set_dev_node() from region.c to bus.c]
    Signed-off-by: Dan Williams

    Toshi Kani
     
  • Upon detection of an unarmed dimm in a region, arrange for descendant
    BTT, PMEM, or BLK instances to be read-only. A dimm is primarily marked
    "unarmed" via flags passed by platform firmware (NFIT).

    The flags in the NFIT memory device sub-structure indicate the state of
    the data on the nvdimm relative to its energy source or last "flush to
    persistence". For the most part there is nothing the driver can do but
    advertise the state of these flags in sysfs and emit a message if
    firmware indicates that the contents of the device may be corrupted.
    However, for the case of ACPI_NFIT_MEM_ARMED, the driver can arrange for
    the block devices incorporating that nvdimm to be marked read-only.
    This is a safe default as the data is still available and new writes are
    held off until the administrator either forces read-write mode, or the
    energy source becomes armed.

    A 'read_only' attribute is added to REGION devices to allow for
    overriding the default read-only policy of all descendant block devices.

    Signed-off-by: Dan Williams

    Dan Williams
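
The read-only policy from the commit above can be sketched as follows. This is an illustration under assumptions: `struct mock_dimm`, `struct mock_region`, and the `force_rw` field are hypothetical, with `force_rw` modelling an administrator using the region's 'read_only' attribute to override the default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a dimm as reported by platform firmware. */
struct mock_dimm {
    bool armed; /* false when firmware flags the dimm as unarmed */
};

/* Hypothetical model of a region built from one or more dimms. */
struct mock_region {
    const struct mock_dimm *dimms;
    size_t num_dimms;
    bool force_rw; /* administrator override of the default policy */
};

/*
 * Default policy: a region is read-only if any dimm in it is unarmed,
 * and every block device below the region inherits that state.
 * The administrator may force read-write mode.
 */
bool region_is_read_only(const struct mock_region *region)
{
    size_t i;

    assert(region != NULL);

    if (region->force_rw)
        return false;

    for (i = 0; i < region->num_dimms; i++) {
        if (!region->dimms[i].armed)
            return true;
    }
    return false;
}
```

Existing data stays readable in either case; only new writes are held off until the override is set or every dimm reports as armed.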
     
  • ...since they are effectively SSDs as far as userspace is concerned.

    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams

    Dan Williams
     
  • This is disabled by default as the overhead is prohibitive, but if the
    user takes the action to turn it on we'll oblige.

    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Various cleanups:

    1/ Kill the BUG_ON since we've already told the block layer we don't
    support DISCARD on all these drivers.

    2/ Kill the 'rw' variable, no need to cache it.

    3/ Kill the local 'sector' variable. bio_for_each_segment() is already
    advancing the iterator's sector number by the bio_vec length.

    4/ Kill the check for accessing past the end of the device;
    generic_make_request_checks() already does that.

    Suggested-by: Christoph Hellwig
    [hch: kill access past end of the device check]
    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams

    Dan Williams
     
  • There is no hardware limit to enforce on the size of the i/o that can be passed
    to an nvdimm block device, so set it to UINT_MAX.

    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Support multiple block sizes (sector + metadata) for nd_blk in the
    same way as done for the BTT. Add the idea of an 'internal' lbasize,
    which is properly aligned and padded, and store metadata in this space.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
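
The 'internal' lbasize idea from the commit above amounts to rounding sector plus metadata up to an aligned stride. The sketch below shows the arithmetic only. The 64-byte alignment, the layout that places metadata directly after the sector data, and all names (`struct blk_layout`, `blk_layout_for()`, and so on) are assumptions for illustration, not values taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define INTERNAL_ALIGN 64u /* assumed alignment, for illustration only */

struct blk_layout {
    uint32_t sector_size;      /* data bytes exposed per block */
    uint32_t meta_size;        /* metadata bytes stored with each block */
    uint32_t internal_lbasize; /* padded on-media stride per block */
};

/* Round v up to the next multiple of align. */
uint32_t round_up_u32(uint32_t v, uint32_t align)
{
    assert(align != 0);
    return ((v + align - 1) / align) * align;
}

/* Compute the padded per-block stride for a sector + metadata format. */
struct blk_layout blk_layout_for(uint32_t sector_size, uint32_t meta_size)
{
    struct blk_layout l;

    l.sector_size = sector_size;
    l.meta_size = meta_size;
    l.internal_lbasize = round_up_u32(sector_size + meta_size, INTERNAL_ALIGN);
    return l;
}

/* Byte offset of the data for logical block 'lba'. */
uint64_t blk_data_offset(const struct blk_layout *l, uint64_t lba)
{
    return lba * l->internal_lbasize;
}

/* Byte offset of the metadata for logical block 'lba' (stored after the data). */
uint64_t blk_meta_offset(const struct blk_layout *l, uint64_t lba)
{
    return blk_data_offset(l, lba) + l->sector_size;
}
```

Under these assumptions a 512 + 8 format occupies 576 bytes per block on media, so block 2 has its data at byte 1152 and its metadata at byte 1664, while a plain 4096-byte format needs no padding.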
     
  • Support multiple block sizes (sector + metadata) using the blk integrity
    framework. This registers a new integrity template that defines the
    protection information tuple size based on the configured metadata size,
    and simply acts as a passthrough for protection information generated by
    another layer. The metadata is written to the storage as-is, and read back
    with each sector.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
     
  • The libnvdimm implementation handles allocating dimm address space (DPA)
    between PMEM and BLK mode interfaces. After DPA has been allocated from
    a BLK-region to a BLK-namespace the nd_blk driver attaches to handle I/O
    as a struct bio based block device. Unlike PMEM, BLK is required to
    handle platform specific details like mmio register formats and memory
    controller interleave. For this reason the libnvdimm generic nd_blk
    driver calls back into the bus provider to carry out the I/O.

    This initial implementation handles the BLK interface defined by the
    ACPI 6 NFIT [1] and the NVDIMM DSM Interface Example [2] composed from
    DCR (dimm control region), BDW (block data window), IDT (interleave
    descriptor) NFIT structures and the hardware register format.
    [1]: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
    [2]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

    Cc: Andy Lutomirski
    Cc: Boaz Harrosh
    Cc: H. Peter Anvin
    Cc: Jens Axboe
    Cc: Ingo Molnar
    Cc: Christoph Hellwig
    Signed-off-by: Ross Zwisler
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Dan Williams

    Ross Zwisler