08 Aug, 2016

1 commit

  • Commit abf545484d31 changed ->rw_page() from an 'rw' flags type to
    the newer ops-based interface, but now we're effectively leaking
    some bdev internals to the rest of the kernel. Since we only care
    about whether it's a read or a write at that level, just pass in a
    bool 'is_write' parameter instead.

    Then we can also move op_is_write() and friends back under
    CONFIG_BLOCK protection.

    Reviewed-by: Mike Christie
    Signed-off-by: Jens Axboe

    Jens Axboe
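
    A minimal sketch of a ->rw_page() implementation after this change,
    assuming the v4.8-era block layer; the example_read()/example_write()
    helpers are hypothetical:

        /* ->rw_page() now receives a plain bool instead of op details */
        static int example_rw_page(struct block_device *bdev, sector_t sector,
                                   struct page *page, bool is_write)
        {
                void *dev = bdev->bd_disk->private_data;
                int rc;

                if (is_write)
                        rc = example_write(dev, page, sector); /* hypothetical */
                else
                        rc = example_read(dev, page, sector);  /* hypothetical */

                page_endio(page, is_write, rc);
                return rc;
        }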
     

05 Aug, 2016

1 commit

  • The rw_page users were not converted to use bio/req ops. As a
    result, bdev_write_page was not passing down REQ_OP_WRITE, and the
    IOs were being sent down as reads.

    Signed-off-by: Mike Christie
    Fixes: 4e1b2d52a80d ("block, fs, drivers: remove REQ_OP compat defs and related code")

    Modified by me to:

    1) Drop op_flags passing into ->rw_page(), as we don't use it.
    2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK

    Signed-off-by: Jens Axboe

    Mike Christie
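
    For reference, a short sketch of what "passing down REQ_OP_WRITE"
    means in the op-based interface (v4.8-era APIs; error handling
    elided):

        struct bio *bio = bio_alloc(GFP_KERNEL, 1);

        bio->bi_bdev = bdev;
        bio->bi_iter.bi_sector = sector;
        bio_add_page(bio, page, PAGE_SIZE, 0);
        /* the direction now travels in the op, not in 'rw' flags */
        bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
        submit_bio(bio);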
     

28 Jun, 2016

1 commit

  • For block drivers that specify a parent device, convert them to use
    device_add_disk().

    This conversion was done with the following semantic patch:

    @@
    struct gendisk *disk;
    expression E;
    @@

    - disk->driverfs_dev = E;
    ...
    - add_disk(disk);
    + device_add_disk(E, disk);

    @@
    struct gendisk *disk;
    expression E1, E2;
    @@

    - disk->driverfs_dev = E1;
    ...
    E2 = disk;
    ...
    - add_disk(E2);
    + device_add_disk(E1, E2);

    ...plus some manual fixups for a few missed conversions.

    Cc: Jens Axboe
    Cc: Keith Busch
    Cc: Michael S. Tsirkin
    Cc: David Woodhouse
    Cc: David S. Miller
    Cc: James Bottomley
    Cc: Ross Zwisler
    Cc: Konrad Rzeszutek Wilk
    Cc: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Dan Williams

    Dan Williams
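
    In driver terms, the conversion boils down to the following sketch,
    where 'pdev' stands in for whatever parent device a given driver
    uses:

        /* before: parent wired up by hand before registration */
        disk->driverfs_dev = &pdev->dev;
        add_disk(disk);

        /* after: parent passed at registration time */
        device_add_disk(&pdev->dev, disk);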
     

05 Apr, 2016

1 commit

  • The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a
    *long* time ago with the promise that one day it would be possible
    to implement the page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE, and it's a constant source of confusion whether the
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too
    much breakage to be doable.

    Let's stop pretending that pages in the page cache are special. They
    are not.

    The changes are pretty straightforward:

    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> E;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header
    files; I've run spatch on them manually.

    The only adjustment after coccinelle is a revert of the changes to
    the PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation
    will also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
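
    A before/after sketch of a typical call site touched by this
    conversion (illustrative only):

        /* before: page-cache units, always equal to page units anyway */
        pgoff_t index = pos >> PAGE_CACHE_SHIFT;
        size_t offset = pos & (PAGE_CACHE_SIZE - 1);
        page_cache_release(page);

        /* after: plain page units */
        pgoff_t index = pos >> PAGE_SHIFT;
        size_t offset = pos & (PAGE_SIZE - 1);
        put_page(page);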
     

09 Sep, 2015

1 commit

  • Pull libnvdimm updates from Dan Williams:
    "This update has successfully completed a 0day-kbuild run and has
    appeared in a linux-next release. The changes outside of the typical
    drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
    removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
    the introduction of ZONE_DEVICE + devm_memremap_pages().

    Summary:

    - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
    mechanism for adding device-driver-discovered memory regions to the
    kernel's direct map.

    This facility is used by the pmem driver to enable pfn_to_page()
    operations on the page frames returned by DAX ('direct_access' in
    'struct block_device_operations').

    For now, the 'memmap' allocation for these "device" pages comes
    from "System RAM". Support for allocating the memmap from device
    memory will arrive in a later kernel.

    - Introduce memremap() to replace usages of ioremap_cache() and
    ioremap_wt(). memremap() drops the __iomem annotation for these
    mappings, since the memory they target has no i/o side effects. The
    replacement of ioremap_cache() with memremap() is limited to the
    pmem driver to ease merging the api change in v4.3.

    Completion of the conversion is targeted for v4.4.

    - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
    driver, update the VFS DAX implementation and PMEM api to provide
    persistence guarantees for kernel operations on a DAX mapping.

    - Convert the ACPI NFIT 'BLK' driver to map the block apertures as
    cacheable to improve performance.

    - Miscellaneous updates and fixes to libnvdimm including support for
    issuing "address range scrub" commands, clarifying the optimal
    'sector size' of pmem devices, a clarification of the usage of the
    ACPI '_STA' (status) property for DIMM devices, and other minor
    fixes"

    * tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
    libnvdimm, pmem: direct map legacy pmem by default
    libnvdimm, pmem: 'struct page' for pmem
    libnvdimm, pfn: 'struct page' provider infrastructure
    x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
    add devm_memremap_pages
    mm: ZONE_DEVICE for "device memory"
    mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
    dax: drop size parameter to ->direct_access()
    nd_blk: change aperture mapping from WC to WB
    nvdimm: change to use generic kvfree()
    pmem, dax: have direct_access use __pmem annotation
    dax: update I/O path to do proper PMEM flushing
    pmem: add copy_from_iter_pmem() and clear_pmem()
    pmem, x86: clean up conditional pmem includes
    pmem: remove layer when calling arch_has_wmb_pmem()
    pmem, x86: move x86 PMEM API to new pmem.h header
    libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
    pmem: switch to devm_ allocations
    devres: add devm_memremap
    libnvdimm, btt: write and validate parent_uuid
    ...

    Linus Torvalds
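
    A minimal sketch of the memremap() and devm_memremap_pages() usage
    described above, assuming the v4.3-era signatures; 'res' stands in
    for a driver-discovered pmem resource:

        /* map persistent memory cacheable, without the __iomem tag */
        void *addr = memremap(res->start, resource_size(res), MEMREMAP_WB);
        if (!addr)
                return -ENOMEM;

        /* or: make the range page-backed so pfn_to_page() works on it */
        void *base = devm_memremap_pages(dev, res);
        if (IS_ERR(base))
                return PTR_ERR(base);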
     

29 Aug, 2015

1 commit

  • Implement the base infrastructure for libnvdimm PFN devices. Similar
    to BTT devices, they take a namespace as a backing device and layer
    functionality on top. In this case the functionality is reserving
    space for an array of 'struct page' entries to be handed out through
    pfn_to_page(). For now this is just the basic libnvdimm device model
    for configuring the base PFN device.

    As the namespace-claiming mechanism for PFN devices is mostly
    identical to that of BTT devices, drivers/nvdimm/claim.c is created
    to house the common bits.

    Cc: Ross Zwisler
    Signed-off-by: Dan Williams

    Dan Williams
     

15 Aug, 2015

3 commits

  • When a BTT is instantiated on a namespace, it must validate that the
    namespace uuid matches the 'parent_uuid' stored in the btt
    superblock. This property ensures that changing the namespace UUID
    invalidates all former BTT instances on that storage. For "IO
    namespaces" that don't have a label or UUID, the parent_uuid is set
    to zero, and this validation is skipped. For such cases, old BTTs
    have to be invalidated by forcing the namespace to raw mode and
    overwriting the BTT info blocks.

    Based on a patch by Dan Williams

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
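
    A minimal sketch of the validation rule described above (helper name
    hypothetical; the uuids are the raw 16-byte arrays involved):

        static bool btt_parent_uuid_ok(const u8 *ns_uuid,
                                       const u8 *parent_uuid)
        {
                static const u8 null_uuid[16];

                /* zero parent_uuid == label-less IO namespace: skip check */
                if (memcmp(parent_uuid, null_uuid, sizeof(null_uuid)) == 0)
                        return true;

                return memcmp(parent_uuid, ns_uuid, 16) == 0;
        }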
     
  • Use arena_is_valid as a common routine for checking the validity of
    an info block from both discover_arenas and nd_btt_probe.

    As a result, don't check the validity of the BTT's UUID and lbasize.
    The checksum in the BTT info block guarantees self-consistency, and
    when we're called from nd_btt_probe, we don't have a valid uuid or
    lbasize available to check against.

    Also clean it up to return a bool instead of an int.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
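
    A rough sketch of the resulting shape (the signature/checksum fields
    and the nd_btt_sb_checksum() helper are assumed from the BTT on-disk
    format; details may differ from the actual patch):

        static bool arena_is_valid(struct nd_btt *nd_btt,
                                   struct btt_sb *super)
        {
                u64 checksum;

                if (memcmp(super->signature, BTT_SIG, BTT_SIG_LEN) != 0)
                        return false;

                /* the checksum field is zeroed while checksumming itself */
                checksum = le64_to_cpu(super->checksum);
                super->checksum = 0;
                if (checksum != nd_btt_sb_checksum(super))
                        return false;
                super->checksum = cpu_to_le64(checksum);

                return true;
        }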
     
  • Consolidate the parameters passed to arena_is_valid into just the
    nd_btt and an info block, to increase re-usability.

    Similarly, btt_arena_write_layout doesn't need to be passed a uuid,
    as it can be obtained from arena->nd_btt.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
     

29 Jul, 2015

1 commit

  • Currently we have two different ways to signal an I/O error on a BIO:

    (1) by clearing the BIO_UPTODATE flag
    (2) by returning a Linux errno value to the bi_end_io callback

    The first one has the drawback of only communicating a single
    possible error (-EIO), and the second one has the drawback of not
    being persistent when bios are queued up, and of not being passed
    along from child to parent bio in the ever more popular chaining
    scenario. Having both mechanisms available has the additional
    drawback of utterly confusing driver authors and introducing bugs
    where various I/O submitters only deal with one of them, and the
    others have to add boilerplate code to deal with both kinds of
    error returns.

    So add a new bi_error field to store an errno value directly in
    struct bio, and remove the existing mechanisms to clean all this up.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: NeilBrown
    Signed-off-by: Jens Axboe

    Christoph Hellwig
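
    A before/after sketch of the error-signaling change described above
    (illustrative):

        /* before: errno handed to the completion path as an argument */
        bio_endio(bio, -EIO);

        /* after: the error travels inside the bio itself */
        bio->bi_error = -EIO;
        bio_endio(bio);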
     

26 Jun, 2015

4 commits

  • Upon detection of an unarmed dimm in a region, arrange for descendant
    BTT, PMEM, or BLK instances to be read-only. A dimm is primarily marked
    "unarmed" via flags passed by platform firmware (NFIT).

    The flags in the NFIT memory device sub-structure indicate the state of
    the data on the nvdimm relative to its energy source or last "flush to
    persistence". For the most part there is nothing the driver can do but
    advertise the state of these flags in sysfs and emit a message if
    firmware indicates that the contents of the device may be corrupted.
    However, for the case of ACPI_NFIT_MEM_ARMED, the driver can arrange for
    the block devices incorporating that nvdimm to be marked read-only.
    This is a safe default as the data is still available and new writes are
    held off until the administrator either forces read-write mode, or the
    energy source becomes armed.

    A 'read_only' attribute is added to REGION devices to allow for
    overriding the default read-only policy of all descendant block devices.

    Signed-off-by: Dan Williams

    Dan Williams
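
    In block-layer terms, the enforcement amounts to something like the
    following sketch (the policy query is hypothetical; set_disk_ro() is
    the standard helper):

        /* mark a descendant disk read-only before registering it */
        if (region_is_readonly(nd_region))      /* hypothetical query */
                set_disk_ro(disk, 1);
        add_disk(disk);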
     
  • This is disabled by default, as the overhead is prohibitive, but if
    the user chooses to turn it on, we'll oblige.

    Reviewed-by: Vishal Verma
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Support multiple block sizes (sector + metadata) using the blk integrity
    framework. This registers a new integrity template that defines the
    protection information tuple size based on the configured metadata size,
    and simply acts as a passthrough for protection information generated by
    another layer. The metadata is written to the storage as-is, and read back
    with each sector.

    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
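
    A rough sketch of the registration described above, against the
    v4.2-era blk_integrity API (names illustrative; the no-op callback
    implements the "passthrough" behavior):

        static int nop_generate_verify(struct blk_integrity_iter *iter)
        {
                return 0;       /* pass protection info through untouched */
        }

        static int btt_integrity_init(struct gendisk *disk, u32 meta_size)
        {
                struct blk_integrity bi = {
                        .name        = "btt-passthrough",
                        .generate_fn = nop_generate_verify,
                        .verify_fn   = nop_generate_verify,
                        .tuple_size  = meta_size,
                        .tag_size    = meta_size,
                };

                return blk_integrity_register(disk, &bi);
        }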
     
  • BTT stands for Block Translation Table, and is a way to provide
    power-fail sector atomicity semantics for block devices that have
    the ability to perform byte-granularity IO. It relies on the
    capability of libnvdimm namespace devices to do byte-aligned IO.

    The BTT works as a stacked block device, and reserves a chunk of
    space from the backing device for its accounting metadata. It is a
    bio-based driver because all IO is done synchronously, and there is
    no queuing or asynchronous completion at either the device or the
    driver level.

    The BTT uses 'lanes' to index into various 'on-disk' data
    structures, and lanes also act as a synchronization mechanism in
    case there are more CPUs than available lanes. We compared two
    lane-lock strategies. In the first, we kept an atomic counter that
    tracked the last lane used, and 'our' lane was determined by
    atomically incrementing it; that way, for the nr_cpus > nr_lanes
    case, theoretically no CPU would be blocked waiting for a lane. The
    other strategy was to take the cpu number we're scheduled on and
    hash it to a lane number (see the sketch after this entry).
    Theoretically, this could block an IO that could've otherwise run
    using a different, free lane. But some fio workloads showed that the
    direct cpu -> lane hash performed faster than tracking the 'last
    lane' - my reasoning is that the cache thrashing caused by bouncing
    the atomic variable around made that approach slower than simply
    waiting out the in-progress IO. This supports the conclusion that
    the driver can be a very simple bio-based one that does synchronous
    IOs instead of queuing.

    Cc: Andy Lutomirski
    Cc: Boaz Harrosh
    Cc: H. Peter Anvin
    Cc: Jens Axboe
    Cc: Ingo Molnar
    Cc: Christoph Hellwig
    Cc: Neil Brown
    Cc: Jeff Moyer
    Cc: Dave Chinner
    Cc: Greg KH
    [jmoyer: fix nmi watchdog timeout in btt_map_init]
    [jmoyer: move btt initialization to module load path]
    [jmoyer: fix memory leak in the btt initialization path]
    [jmoyer: Don't overwrite corrupted arenas]
    Signed-off-by: Vishal Verma
    Signed-off-by: Dan Williams

    Vishal Verma
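
    A minimal sketch of the winning cpu -> lane hash strategy (names
    illustrative; the per-lane lock only matters when there are more
    CPUs than lanes):

        struct btt_lane {
                spinlock_t lock;
        };

        /* pin to the current cpu, then hash it onto a lane */
        static unsigned int btt_acquire_lane(struct btt_lane *lanes,
                                             unsigned int nr_lanes)
        {
                unsigned int cpu = get_cpu();
                unsigned int lane = cpu % nr_lanes;

                if (nr_lanes < num_online_cpus())
                        spin_lock(&lanes[lane].lock);
                return lane;
        }

        static void btt_release_lane(struct btt_lane *lanes,
                                     unsigned int nr_lanes,
                                     unsigned int lane)
        {
                if (nr_lanes < num_online_cpus())
                        spin_unlock(&lanes[lane].lock);
                put_cpu();
        }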