04 Aug, 2020

1 commit

  • Pull core block updates from Jens Axboe:
    "Good amount of cleanups and tech debt removals in here, and as a
    result, the diffstat shows a nice net reduction in code.

    - Softirq completion cleanups (Christoph)

    - Stop using ->queuedata (Christoph)

    - Cleanup bd claiming (Christoph)

    - Use check_events, moving away from the legacy media change
    (Christoph)

    - Use inode i_blkbits consistently (Christoph)

    - Remove old unused writeback congestion bits (Christoph)

    - Cleanup/unify submission path (Christoph)

    - Use bio_uninit consistently, instead of bio_disassociate_blkg
    (Christoph)

    - sbitmap cleared bits handling (John)

    - Request merging blktrace event addition (Jan)

    - sysfs add/remove race fixes (Luis)

    - blk-mq tag fixes/optimizations (Ming)

    - Duplicate words in comments (Randy)

    - Flush deferral cleanup (Yufen)

    - IO context locking/retry fixes (John)

    - struct_size() usage (Gustavo)

    - blk-iocost fixes (Chengming)

    - blk-cgroup IO stats fixes (Boris)

    - Various little fixes"

    * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
    block: blk-timeout: delete duplicated word
    block: blk-mq-sched: delete duplicated word
    block: blk-mq: delete duplicated word
    block: genhd: delete duplicated words
    block: elevator: delete duplicated word and fix typos
    block: bio: delete duplicated words
    block: bfq-iosched: fix duplicated word
    iocost_monitor: start from the oldest usage index
    iocost: Fix check condition of iocg abs_vdebt
    block: Remove callback typedefs for blk_mq_ops
    block: Use non _rcu version of list functions for tag_set_list
    blk-cgroup: show global disk stats in root cgroup io.stat
    blk-cgroup: make iostat functions visible to stat printing
    block: improve discard bio alignment in __blkdev_issue_discard()
    block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
    block: defer flush request no matter whether we have elevator
    block: make blk_timeout_init() static
    block: remove retry loop in ioc_release_fn()
    block: remove unnecessary ioc nested locking
    block: integrate bd_start_claiming into __blkdev_get
    ...

    Linus Torvalds
     

24 Jul, 2020

1 commit

  • Commit adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 ("dm: report suspended
    device during destroy") broke integrity recalculation.

    The problem is dm_suspended() returns true not only during suspend,
    but also during resume. So this race condition could occur:
    1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
    2. integrity_recalc (&ic->recalc_work) preempts the current thread
    3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
    4. integrity_recalc exits and no recalculating is done.

    To fix this race condition, add a function dm_post_suspending that is
    only true during the postsuspend phase and use it instead of
    dm_suspended().
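    The fix can be modeled in a few lines of plain C. This is a simplified
    userspace sketch, not the kernel code: the flag names mirror DM's DMF_*
    bits but the struct contents and helper names here are illustrative.

```c
#include <assert.h>

/* Separate bits let the recalc worker distinguish the post-suspend
 * phase from "suspended or resuming", so work queued by
 * dm_integrity_resume() is no longer skipped during resume. */
#define DMF_SUSPENDED       (1u << 0)
#define DMF_POST_SUSPENDING (1u << 1)

struct mapped_device { unsigned flags; };

static int dm_suspended_md(const struct mapped_device *md)
{
    return !!(md->flags & DMF_SUSPENDED);
}

static int dm_post_suspending_md(const struct mapped_device *md)
{
    return !!(md->flags & DMF_POST_SUSPENDING);
}

/* The recalc worker bails out only while actually tearing down,
 * instead of checking dm_suspended_md(), which is also true on resume. */
static int should_skip_recalc(const struct mapped_device *md)
{
    return dm_post_suspending_md(md);
}
```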

    Signed-off-by: Mikulas Patocka
    Fixes: adc0daad366b ("dm: report suspended device during destroy")
    Cc: stable@vger.kernel.org # v4.18+
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     

09 Jul, 2020

1 commit

  • Except for pktdvd, the only places setting congested bits are file
    systems that allocate their own backing_dev_info structures. And
    pktdvd is a deprecated driver that isn't useful in stacked setups
    either. So remove the dead congested_fn stacking infrastructure.

    Signed-off-by: Christoph Hellwig
    Acked-by: Song Liu
    Acked-by: David Sterba
    [axboe: fixup unused variables in bcache/request.c]
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

26 Nov, 2019

1 commit

  • …device-mapper/linux-dm

    Pull device mapper updates from Mike Snitzer:

    - Fix DM core to disallow stacking request-based DM on partitions.

    - Fix DM raid target to properly resync raidset even if bitmap needed
    additional pages.

    - Fix DM crypt performance regression due to use of WQ_HIGHPRI for the
    IO and crypt workqueues.

    - Fix DM integrity metadata layout that was aligned on 128K boundary
    rather than the intended 4K boundary (removes 124K of wasted space
    for each metadata block).

    - Improve the DM thin, cache and clone targets to use spin_lock_irq
    rather than spin_lock_irqsave where possible.

    - Fix DM thin single thread performance that was lost due to needless
    workqueue wakeups.

    - Fix DM zoned target performance that was lost due to excessive
    backing device checks.

    - Add ability to trigger write failure with the DM dust test target.

    - Fix whitespace indentation in drivers/md/Kconfig.

    - Various small fixes and cleanups (e.g. use struct_size, fix
    uninitialized variable, variable renames, etc).

    * tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (22 commits)
    Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues"
    dm: Fix Kconfig indentation
    dm thin: wakeup worker only when deferred bios exist
    dm integrity: fix excessive alignment of metadata runs
    dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layout
    dm zoned: reduce overhead of backing device checks
    dm dust: add limited write failure mode
    dm dust: change ret to r in dust_map_read and dust_map
    dm dust: change result vars to r
    dm cache: replace spin_lock_irqsave with spin_lock_irq
    dm bio prison: replace spin_lock_irqsave with spin_lock_irq
    dm thin: replace spin_lock_irqsave with spin_lock_irq
    dm clone: add bucket_lock_irq/bucket_unlock_irq helpers
    dm clone: replace spin_lock_irqsave with spin_lock_irq
    dm writecache: handle REQ_FUA
    dm writecache: fix uninitialized variable warning
    dm stripe: use struct_size() in kmalloc()
    dm raid: streamline rs_get_progress() and its raid_status() caller side
    dm raid: simplify rs_setup_recovery call chain
    dm raid: to ensure resynchronization, perform raid set grow in preresume
    ...

    Linus Torvalds
     

13 Nov, 2019

1 commit

  • Avoid the need to allocate a potentially large array of struct blk_zone
    in the block layer by switching the ->report_zones method interface to
    a callback model. Now the caller simply supplies a callback that is
    executed on each reported zone, and private data for it.
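    The callback model described above can be sketched in plain C. This is a
    userspace toy with reduced types, not the kernel signatures; `blk_zone`
    here carries only a start sector and length.

```c
#include <assert.h>
#include <stdint.h>

struct blk_zone { uint64_t start; uint64_t len; };

/* Caller-supplied callback, invoked once per reported zone. */
typedef int (*report_zones_cb)(struct blk_zone *zone, unsigned int idx,
                               void *data);

/* Driver-side ->report_zones sketch: hand each zone to the callback
 * instead of filling a large caller-allocated array. */
static int report_zones(struct blk_zone *zones, unsigned int nr_zones,
                        report_zones_cb cb, void *data)
{
    for (unsigned int i = 0; i < nr_zones; i++) {
        int ret = cb(&zones[i], i, data);
        if (ret)
            return ret;  /* the callback can stop the report early */
    }
    return 0;
}

/* Example callback: count zones via the private data pointer. */
static int count_cb(struct blk_zone *zone, unsigned int idx, void *data)
{
    (void)zone; (void)idx;
    (*(unsigned int *)data)++;
    return 0;
}
```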

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Shin'ichiro Kawasaki
    Signed-off-by: Damien Le Moal
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

06 Nov, 2019

1 commit

  • One of the more common cases of allocation size calculations is finding
    the size of a structure that has a zero-sized array at the end, along
    with memory for some number of elements for that array. For example:

    struct stripe_c {
    ...
    struct stripe stripe[0];
    };

    In this case alloc_context() and dm_array_too_big() are removed and
    replaced by the direct use of the struct_size() helper in kmalloc().

    Notice that open-coded form is prone to type mistakes.

    This code was detected with the help of Coccinelle.
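    The pattern is easy to reproduce in userspace. The `struct_size_of`
    macro below is a stand-in for the kernel's struct_size() helper (which
    additionally saturates on overflow); the stripe structs are trimmed to
    a single field for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct stripe { uint64_t physical_start; };

struct stripe_c {
    uint32_t stripes;
    struct stripe stripe[];   /* flexible array member */
};

/* Userspace stand-in for the kernel's struct_size(): size of the
 * struct plus n trailing array elements, computed from types rather
 * than open-coded arithmetic. */
#define struct_size_of(ptr, member, n) \
    (sizeof(*(ptr)) + (n) * sizeof((ptr)->member[0]))

static struct stripe_c *alloc_stripe_c(uint32_t n)
{
    /* sizeof does not evaluate its operand, so using sc in its own
     * initializer is well defined. */
    struct stripe_c *sc = malloc(struct_size_of(sc, stripe, n));
    if (sc)
        sc->stripes = n;
    return sc;
}
```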

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Mike Snitzer

    Gustavo A. R. Silva
     

19 Jul, 2019

1 commit

  • …t/device-mapper/linux-dm

    Pull more device mapper updates from Mike Snitzer:

    - Fix zone state management race in DM zoned target by eliminating the
    unnecessary DMZ_ACTIVE state.

    - A couple of fixes for issues that the DM snapshot target's optional
    discard support introduced during the first week of the 5.3 merge
    window.

    - Increase the default size of outstanding IO that is allowed for each
    dm-kcopyd client and introduce a tunable to allow users to adjust it.

    - Update DM core to use printk ratelimiting functions rather than
    duplicate them and in doing so fix an issue where DMDEBUG_LIMIT()
    rate limited KERN_DEBUG messages had excessive "callbacks suppressed"
    messages.

    * tag 'for-5.3/dm-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm: use printk ratelimiting functions
    dm kcopyd: Increase default sub-job size to 512KB
    dm snapshot: fix oversights in optional discard support
    dm zoned: fix zone state management race

    Linus Torvalds
     

18 Jul, 2019

1 commit

  • DM provided its own ratelimiting printk wrapper but given printk
    advances this is no longer needed.

    Also, switching DMDEBUG_LIMIT to using pr_debug_ratelimited() fixes the
    reported issue where DMDEBUG_LIMIT() still caused a flood of "callbacks
    suppressed" messages.

    Reported-by: Milan Broz
    Depends-on: 29fc2bc7539386 ("printk: pr_debug_ratelimited: check state first to reduce "callbacks suppressed" messages")
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

12 Jul, 2019

1 commit

  • Only GFP_KERNEL and GFP_NOIO are used with blkdev_report_zones(). In
    preparation of using vmalloc() for large report buffer and zone array
    allocations used by this function, remove its "gfp_t gfp_mask" argument
    and rely on the caller context to use memalloc_noio_save/restore() where
    necessary (block layer zone revalidation and dm-zoned I/O error path).
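    The memalloc_noio_save/restore pattern the message relies on can be
    modeled simply. This is a single-threaded toy (the kernel stores the
    flag in per-task state); the names mirror the kernel helpers but the
    flag value is illustrative.

```c
#include <assert.h>

/* Instead of threading a gfp_mask argument through every caller, the
 * caller marks a scope in which allocations must not recurse into I/O. */
static unsigned int task_flags;
#define PF_MEMALLOC_NOIO 0x1u

static unsigned int memalloc_noio_save(void)
{
    unsigned int old = task_flags;
    task_flags |= PF_MEMALLOC_NOIO;
    return old;
}

static void memalloc_noio_restore(unsigned int old)
{
    task_flags = old;
}

/* An allocator would consult this instead of a passed-in gfp mask. */
static int alloc_may_do_io(void)
{
    return !(task_flags & PF_MEMALLOC_NOIO);
}
```

    Restoring the saved value (rather than clearing the bit) keeps nested
    scopes correct, as the test below shows.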

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Damien Le Moal
    Signed-off-by: Jens Axboe

    Damien Le Moal
     

26 Apr, 2019

1 commit

  • After commit 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via
    blk_insert_cloned_request feedback"), map_request() will requeue the tio
    when issued clone request return BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE.

    Thus, if the device driver status is error, a tio may be requeued
    multiple times until the return value is not DM_MAPIO_REQUEUE. That
    means type->start_io may be called multiple times, while type->end_io
    is only called when the IO completes.

    In fact, even without commit 396eaf21ee17, setup_clone() failure can
    also cause tio requeue and associated missed call to type->end_io.

    The service-time path selector selects path based on in_flight_size,
    which is increased by st_start_io() and decreased by st_end_io().
    Missed calls to st_end_io() can lead to in_flight_size count error and
    will cause the selector to make the wrong choice. In addition,
    queue-length path selector will also be affected.

    To fix the problem, call type->end_io in ->release_clone_rq before the
    tio requeue. map_info is passed to ->release_clone_rq() for the
    map_request() error path that results in a requeue.

    Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
    Cc: stable@vger.kernel.org
    Signed-off-by: Yufen Yu
    Signed-off-by: Mike Snitzer

    Yufen Yu
     

06 Mar, 2019

2 commits

  • Add a "create" module parameter, which allows device-mapper targets to
    be configured at boot time. This enables early use of DM targets in the
    boot process (as the root device or otherwise) without the need of an
    initramfs.

    The syntax used in the boot param is based on the concise format from
    the dmsetup tool to follow the rule of least surprise:

    dmsetup table --concise /dev/mapper/lroot

    Which is:
    dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+]

    Where,
    <name>          ::= The device name.
    <uuid>          ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ""
    <minor>         ::= The device minor number | ""
    <flags>         ::= "ro" | "rw"
    <table>         ::= <start_sector> <num_sectors> <target_type> <target_args>
    <target_type>   ::= "verity" | "linear" | ...

    For example, the following could be added in the boot parameters:
    dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0

    Only targets that were tested are allowed, and only the ones that
    don't change any block device when the device is created as read-only.
    For example, mirror and cache targets are not allowed. The rationale
    behind this is that if the user makes a mistake, choosing the wrong
    device to be the mirror or the cache can corrupt data.

    The only targets initially allowed are:
    * crypt
    * delay
    * linear
    * snapshot-origin
    * striped
    * verity
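
    Splitting one device description out of the concise format above can be
    sketched as below. This is illustrative only; the kernel's actual parser
    (in dm-init.c) handles multiple devices, quoting, and validation.

```c
#include <assert.h>
#include <string.h>

/* Split "name,uuid,minor,flags,table..." in place at the first four
 * commas, mirroring the field layout of dm-mod.create=. Returns the
 * number of fields found (up to 5; the fifth is the table string,
 * which may itself contain commas for multi-target tables). */
static int split_fields(char *s, char *fields[5])
{
    int n = 0;
    fields[n++] = s;
    for (char *p = s; *p && n < 5; p++) {
        if (*p == ',') {
            *p = '\0';
            fields[n++] = p + 1;
        }
    }
    return n;
}
```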

    Co-developed-by: Will Drewry
    Co-developed-by: Kees Cook
    Co-developed-by: Enric Balletbo i Serra
    Signed-off-by: Helen Koike
    Reviewed-by: Kees Cook
    Signed-off-by: Mike Snitzer

    Helen Koike
     
  • A dm-raid array with devices larger than 4GB won't assemble on
    a 32-bit host since _check_data_dev_sectors() was added in 4.16.
    This is because to_sector() treats its argument as an "unsigned long",
    which is 32 bits (4GB) on a 32-bit host. Using "unsigned long long"
    is more correct.

    Kernels as early as 4.2 can have other problems due to to_sector()
    being used on the size of a device.
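    The truncation is easy to demonstrate with fixed-width types. Here
    uint32_t stands in for a 32-bit host's "unsigned long"; the shift by 9
    converts bytes to 512-byte sectors, as to_sector() does.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the bug: a byte count over 4GB is truncated to 32 bits
 * before the shift, so the computed sector count is wrong. */
static uint32_t to_sector_32(uint32_t bytes) { return bytes >> 9; }

/* Model of the fix: a 64-bit argument survives the conversion. */
static uint64_t to_sector_64(uint64_t bytes) { return bytes >> 9; }
```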

    Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality")
    cc: stable@vger.kernel.org (v4.2+)
    Reported-and-tested-by: Guillaume Perréal
    Signed-off-by: NeilBrown
    Signed-off-by: Mike Snitzer

    NeilBrown
     

27 Oct, 2018

1 commit

  • …/device-mapper/linux-dm

    Pull device mapper updates from Mike Snitzer:

    - The biggest change this cycle is to remove support for the legacy IO
    path (.request_fn) from request-based DM.

    Jens has already started preparing for complete removal of the legacy
    IO path in 4.21 but this earlier removal of support from DM has been
    coordinated with Jens (as evidenced by the commit being attributed to
    him).

    Making request-based DM exclusively blk-mq cleans up that
    portion of DM core quite nicely.

    - Convert the thinp and zoned targets over to using refcount_t where
    applicable.

    - A couple fixes to the DM zoned target for refcounting and other races
    buried in the implementation of metadata block creation and use.

    - Small cleanups to remove redundant unlikely() around a couple
    WARN_ON_ONCE().

    - Simplify how dm-ioctl copies from userspace, eliminating some
    potential for a malicious user trying to change the executed ioctl
    after its processing has begun.

    - Tweaked DM crypt target to use the DM device name when naming the
    various workqueues created for a particular DM crypt device (makes
    the N workqueues for a DM crypt device more easily understood and
    enhances user's accounting capabilities at a glance via "ps")

    - Small fixup to remove dead branch in DM writecache's memory_entry().

    * tag 'for-4.20/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm writecache: remove disabled code in memory_entry()
    dm zoned: fix various dmz_get_mblock() issues
    dm zoned: fix metadata block ref counting
    dm raid: avoid bitmap with raid4/5/6 journal device
    dm crypt: make workqueue names device-specific
    dm: add dm_table_device_name()
    dm ioctl: harden copy_params()'s copy_from_user() from malicious users
    dm: remove unnecessary unlikely() around WARN_ON_ONCE()
    dm zoned: target: use refcount_t for dm zoned reference counters
    dm thin: use refcount_t for thin_c reference counting
    dm table: require that request-based DM be layered on blk-mq devices
    dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
    dm: remove legacy request-based IO path

    Linus Torvalds
     

26 Oct, 2018

1 commit

  • Dispatching a report zones command through the request queue is a major
    pain due to the command reply payload rewriting necessary. Given that
    blkdev_report_zones() is executing everything synchronously, implement
    report zones as a block device file operation instead, allowing major
    simplification of the code in many places.

    As sd, null-blk, dm-linear and dm-flakey are the only block device
    drivers that support exposing zoned block devices, these drivers are
    modified to provide the device side implementation of the
    report_zones() block device file operation.

    For device mappers, a new report_zones() target type operation is
    defined so that upper block layer calls to blkdev_report_zones() can
    be propagated down to the underlying devices of the dm targets.
    Implementation for this new operation is added to the dm-linear and
    dm-flakey targets.

    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    [Damien]
    * Changed method block_device argument to gendisk
    * Various bug fixes and improvements
    * Added support for null_blk, dm-linear and dm-flakey.
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Mike Snitzer
    Signed-off-by: Damien Le Moal
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

23 May, 2018

1 commit

  • Similar to the ->copy_from_iter() operation, a platform may want to
    deploy an architecture or device specific routine for handling reads
    from a dax_device like /dev/pmemX. On x86 this routine will point to a
    machine check safe version of copy_to_iter(). For now, add the plumbing
    to device-mapper and the dax core.
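    The shape of the plumbing can be modeled with a plain function-pointer
    table. This is a heavily simplified sketch (plain buffers instead of
    iov_iter, invented helper names); it only shows how a platform could
    substitute, say, a machine-check-safe routine for the read path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a dax_operations-style table gaining a ->copy_to_iter()
 * hook alongside the existing ->copy_from_iter(). */
struct dax_ops {
    size_t (*copy_from_iter)(void *dst, const void *src, size_t n);
    size_t (*copy_to_iter)(void *dst, const void *src, size_t n);
};

/* Default implementation; an arch could install an MC-safe variant. */
static size_t plain_copy(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return n;  /* bytes copied */
}

static const struct dax_ops generic_dax_ops = {
    .copy_from_iter = plain_copy,
    .copy_to_iter   = plain_copy,
};
```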

    Cc: Ross Zwisler
    Cc: Mike Snitzer
    Cc: Christoph Hellwig
    Signed-off-by: Dan Williams

    Dan Williams
     

07 Apr, 2018

1 commit

  • …/device-mapper/linux-dm

    Pull device mapper updates from Mike Snitzer:

    - DM core passthrough ioctl fix to retain reference to DM table, and
    that table's block devices, while issuing the ioctl to one of those
    block devices.

    - DM core passthrough ioctl fix to _not_ override the fmode_t used to
    issue the ioctl. Overriding by using the fmode_t that the block
    device was originally open with during DM table load is a liability.

    - Add DM core support for secure erase forwarding and update the DM
    linear and DM striped targets to support them.

    - A DM core 4.16 stable fix to allow abnormal IO (e.g. discard, write
    same, write zeroes) for targets that make use of the non-splitting IO
    variant (as is done for multipath or thinp when layered directly on
    NVMe).

    - Allow DM targets to return a payload in response to a DM message that
    they are sent. This is useful for DM targets that would like to
    provide statistics data in response to DM messages.

    - Update DM bufio to support non-power-of-2 block sizes. Numerous other
    related changes prepare the DM bufio code for this support.

    - Fix DM crypt to use a bounded amount of memory across the entire
    system. This is to avoid OOM that can otherwise occur in response to
    certain pathological IO workloads (e.g. discarding a large DM crypt
    device).

    - Add a 'check_at_most_once' feature to the DM verity target to allow
    verity to be used on mobile devices that have very limited resources.

    - Fix the DM integrity target to fail early if a keyed algorithm (e.g.
    HMAC) is to be used but the key isn't set.

    - Add non-power-of-2 support to the DM unstripe target.

    - Eliminate the use of a Variable Length Array in the DM stripe target.

    - Update the DM log-writes target to record metadata (REQ_META flag).

    - DM raid fixes for its nosync status and some variable range issues.

    * tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits)
    dm: remove fmode_t argument from .prepare_ioctl hook
    dm: hold DM table for duration of ioctl rather than use blkdev_get
    dm raid: fix parse_raid_params() variable range issue
    dm verity: make verity_for_io_block static
    dm verity: add 'check_at_most_once' option to only validate hashes once
    dm bufio: don't embed a bio in the dm_buffer structure
    dm bufio: support non-power-of-two block sizes
    dm bufio: use slab cache for dm_buffer structure allocations
    dm bufio: reorder fields in dm_buffer structure
    dm bufio: relax alignment constraint on slab cache
    dm bufio: remove code that merges slab caches
    dm bufio: get rid of slab cache name allocations
    dm bufio: move dm-bufio.h to include/linux/
    dm bufio: delete outdated comment
    dm: add support for secure erase forwarding
    dm: backfill abnormal IO support to non-splitting IO submission
    dm raid: fix nosync status
    dm mpath: use DM_MAPIO_SUBMITTED instead of magic number 0 in process_queued_bios()
    dm stripe: get rid of a Variable Length Array (VLA)
    dm log writes: record metadata flag for better flags record
    ...

    Linus Torvalds
     

05 Apr, 2018

1 commit

  • Use the fmode_t that is passed to dm_blk_ioctl() rather than
    inconsistently (varies across targets) drop it on the floor by
    overriding it with the fmode_t stored in 'struct dm_dev'.

    All the persistent reservation functions weren't using the fmode_t they
    got back from .prepare_ioctl so remove them.

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

18 Mar, 2018

1 commit

  • It happens often while I'm preparing a patch for a block driver that
    I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
    available for this driver? Do I have to introduce definitions of these
    constants before I can use these constants? To avoid this confusion,
    move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
    <linux/blkdev.h> header file such that these become available for all
    block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
    header file conditional to avoid that including that header file after
    <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
    redefinition.

    Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
    not been removed from uapi header files nor from NAND drivers in
    which these constants are used for another purpose than converting
    block layer offsets and sizes into a number of sectors.
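    For reference, the shared definitions and the conversions they enable:
    one 512-byte unit used for all block layer offset/size arithmetic,
    regardless of a device's logical block size. The helper names below are
    illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9
#define SECTOR_SIZE  (1 << SECTOR_SHIFT)

/* Byte/sector conversions are just shifts by SECTOR_SHIFT. */
static uint64_t bytes_to_sectors(uint64_t bytes) { return bytes >> SECTOR_SHIFT; }
static uint64_t sectors_to_bytes(uint64_t sect)  { return sect << SECTOR_SHIFT; }
```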

    Cc: David S. Miller
    Cc: Mike Snitzer
    Cc: Dan Williams
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Reviewed-by: Sergey Senozhatsky
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

30 Jan, 2018

1 commit

  • Add DM_ENDIO_DELAY_REQUEUE to allow request-based multipath's
    multipath_end_io() to instruct dm-rq.c:dm_done() to delay a requeue.
    This is beneficial to do if BLK_STS_RESOURCE is returned from the target
    (because target is busy).

    Relative to blk-mq: kick the hw queues via blk_mq_requeue_work(),
    indirectly from dm-rq.c:__dm_mq_kick_requeue_list(), after a delay.

    For old .request_fn: use blk_delay_queue().
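    The decision this adds to the completion path can be sketched as a
    three-way switch. The enum mirrors DM's return codes; the 100 ms delay
    is illustrative, not the kernel's actual constant.

```c
#include <assert.h>

/* The target's end_io hook now has a third answer besides "done"
 * and "requeue immediately". */
enum dm_endio_action {
    DM_ENDIO_DONE,
    DM_ENDIO_REQUEUE,
    DM_ENDIO_DELAY_REQUEUE,
};

/* Returns the requeue delay in ms, or -1 when the request is done. */
static int dm_done_delay_ms(enum dm_endio_action a)
{
    switch (a) {
    case DM_ENDIO_REQUEUE:       return 0;    /* kick queues right away */
    case DM_ENDIO_DELAY_REQUEUE: return 100;  /* back off: target is busy */
    default:                     return -1;   /* complete the request */
    }
}
```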

    bio-based multipath doesn't have feature parity with request-based for
    retryable error requeues; that is something that'll need fixing in the
    future.

    Suggested-by: Bart Van Assche
    Signed-off-by: Mike Snitzer
    Acked-by: Bart Van Assche
    [as interpreted from Bart's "... patch looks fine to me."]

    Mike Snitzer
     

20 Dec, 2017

1 commit

  • If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED then
    all devices in the DM table do not support partial completions. Also,
    the table has a single immutable target that doesn't require DM core to
    split bios.

    This will enable adding NVMe optimizations to bio-based DM.

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

17 Dec, 2017

1 commit

  • Eliminates need for a separate mempool to allocate 'struct dm_io'
    objects from. As such, it saves an extra mempool allocation for each
    original bio that DM core is issued.

    This complicates the per-bio-data accessor functions by needing to
    conditionally add extra padding to get to a target's per-bio-data. But
    in the end this provides a decent performance improvement for all
    bio-based DM devices.

    On an NVMe-loop based testbed to a ramdisk (~3100 MB/s): bio-based
    DM linear performance improved by 2% (went from 2665 to 2777 MB/s).
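    The layout change can be modeled with a flexible array member: the
    dm_io bookkeeping sits in front of the target's per-bio data, so one
    allocation serves both, and the accessor skips over the header. Struct
    contents and helper names here are placeholders, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct dm_io { int magic; };  /* DM core's per-IO state (placeholder) */

struct clone_mem {
    struct dm_io io;          /* fronts the allocation */
    char target_data[];       /* target's per-bio data follows */
};

/* Accessor: step past the dm_io header to the target's data. */
static void *dm_per_bio_data(struct clone_mem *m)
{
    return m->target_data;
}

/* Inverse: recover the containing allocation from the data pointer. */
static struct clone_mem *dm_io_from_per_bio_data(void *data)
{
    return (struct clone_mem *)
        ((char *)data - offsetof(struct clone_mem, target_data));
}
```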

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

14 Dec, 2017

1 commit

  • No DM target provides num_write_bios and none has since dm-cache's
    brief use in 2013.

    Having the possibility of num_write_bios > 1 complicates bio
    allocation. So remove the interface and assume there is only one bio
    needed.

    If a target ever needs more, it must provide a suitable bioset and
    allocate itself based on its particular needs.

    Signed-off-by: NeilBrown
    Signed-off-by: Mike Snitzer

    NeilBrown
     

11 Sep, 2017

1 commit

  • Commit abebfbe2f731 ("dm: add ->flush() dax operation support") is
    buggy. A DM device may be composed of multiple underlying devices and
    all of them need to be flushed. That commit just routes the flush
    request to the first device and ignores the other devices.

    It could be fixed by adding more complex logic to the device mapper. But
    there is only one implementation of the method pmem_dax_ops->flush - that
    is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we
    don't need the pmem_dax_ops->flush abstraction at all, we can call
    arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush
    can't ever reach anything different from arch_wb_cache_pmem().

    It should be also pointed out that for some uses of persistent memory it
    is needed to flush only a very small amount of data (such as 1 cacheline),
    and it would be overkill if we go through that device mapper machinery for
    a single flushed cache line.

    Fix this by removing the pmem_dax_ops->flush abstraction and call
    arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device
    mapper code that forwards the flushes.

    Fixes: abebfbe2f731 ("dm: add ->flush() dax operation support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka
    Reviewed-by: Dan Williams
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     

28 Aug, 2017

2 commits

  • The arrays of 'struct dm_arg' are never modified by the device-mapper
    core, so constify them so that they are placed in .rodata.

    (Exception: the args array in dm-raid cannot be constified because it is
    allocated on the stack and modified.)

    Signed-off-by: Eric Biggers
    Signed-off-by: Mike Snitzer

    Eric Biggers
     
  • Using the same rate limiting state for different kinds of messages
    is wrong because this can cause a high frequency message to suppress
    a report of a low frequency message. Hence use a unique rate limiting
    state per message type.
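    The effect of per-site state is easy to demonstrate. This toy limiter
    models only a burst count (the kernel's struct ratelimit_state also
    tracks a time interval); names are illustrative.

```c
#include <assert.h>

/* Each message site gets its own state, so a chatty message cannot
 * exhaust the budget of a rare one. */
struct ratelimit_state { int burst; int used; };

/* Returns 1 if the message may be printed, 0 if suppressed. */
static int ratelimit_allow(struct ratelimit_state *rs)
{
    if (rs->used >= rs->burst)
        return 0;
    rs->used++;
    return 1;
}
```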

    Fixes: 71a16736a15e ("dm: use local printk ratelimit")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bart Van Assche
    Signed-off-by: Mike Snitzer

    Bart Van Assche
     

08 Jul, 2017

1 commit

  • Pull libnvdimm updates from Dan Williams:
    "libnvdimm updates for the latest ACPI and UEFI specifications. This
    pull request also includes the new 'struct dax_operations', which
    makes it possible to undo the abuse of copy_user_nocache() for copy
    operations to pmem.

    The dax work originally missed 4.12 to address concerns raised by Al.

    Summary:

    - Introduce the _flushcache() family of memory copy helpers and use
    them for persistent memory write operations on x86. The
    _flushcache() semantic indicates that the cache is either bypassed
    for the copy operation (movnt) or any lines dirtied by the copy
    operation are written back (clwb, clflushopt, or clflush).

    - Extend dax_operations with ->copy_from_iter() and ->flush()
    operations. These operations and other infrastructure updates allow
    all persistent memory specific dax functionality to be pushed into
    libnvdimm and the pmem driver directly. It also allows dax-specific
    sysfs attributes to be linked to a host device, for example:
    /sys/block/pmem0/dax/write_cache

    - Add support for the new NVDIMM platform/firmware mechanisms
    introduced in ACPI 6.2 and UEFI 2.7. This support includes the v1.2
    namespace label format, extensions to the address-range-scrub
    command set, new error injection commands, and a new BTT
    (block-translation-table) layout. These updates support inter-OS
    and pre-OS compatibility.

    - Fix a longstanding memory corruption bug in nfit_test.

    - Make the pmem and nvdimm-region 'badblocks' sysfs files poll(2)
    capable.

    - Miscellaneous fixes and small updates across libnvdimm and the nfit
    driver.

    Acknowledgements that came after the branch was pushed: commit
    6aa734a2f38e ("libnvdimm, region, pmem: fix 'badblocks'
    sysfs_get_dirent() reference lifetime") was reviewed by Toshi Kani
    "

    * tag 'libnvdimm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (42 commits)
    libnvdimm, namespace: record 'lbasize' for pmem namespaces
    acpi/nfit: Issue Start ARS to retrieve existing records
    libnvdimm: New ACPI 6.2 DSM functions
    acpi, nfit: Show bus_dsm_mask in sysfs
    libnvdimm, acpi, nfit: Add bus level dsm mask for pass thru.
    acpi, nfit: Enable DSM pass thru for root functions.
    libnvdimm: passthru functions clear to send
    libnvdimm, btt: convert some info messages to warn/err
    libnvdimm, region, pmem: fix 'badblocks' sysfs_get_dirent() reference lifetime
    libnvdimm: fix the clear-error check in nsio_rw_bytes
    libnvdimm, btt: fix btt_rw_page not returning errors
    acpi, nfit: quiet invalid block-aperture-region warnings
    libnvdimm, btt: BTT updates for UEFI 2.7 format
    acpi, nfit: constify *_attribute_group
    libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile region
    libnvdimm, pmem, dax: export a cache control attribute
    dax: convert to bitmask for flags
    dax: remove default copy_from_iter fallback
    libnvdimm, nfit: enable support for volatile ranges
    libnvdimm, pmem: fix persistence warning
    ...

    Linus Torvalds
     

19 Jun, 2017

3 commits

  • A target driver supporting zoned block devices and exposing them as
    such may receive a REQ_OP_ZONE_REPORT request when the user wants to
    determine the mapped device's zone configuration. To properly process
    such a request, the target driver may need to remap the zone
    descriptors provided in the report reply. The helper function
    dm_remap_zone_report() does this generically using only the target
    start offset and length and the start offset within the target device.

    dm_remap_zone_report() will remap the start sector of all zones
    reported. If the report includes sequential zones, the write pointer
    position of these zones will also be remapped.
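    The remapping described above is plain offset arithmetic. A minimal
    sketch, with a reduced zone struct (field names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

struct zone {
    uint64_t start;  /* zone start sector */
    uint64_t wp;     /* write pointer, for sequential zones */
    int seq;         /* 1 if this is a sequential zone */
};

/* Shift a reported zone from the backing device's address space into
 * the target's; sequential zones get their write pointer moved too. */
static void remap_zone(struct zone *z, uint64_t target_start,
                       uint64_t dev_start)
{
    uint64_t offset = target_start - dev_start;

    z->start += offset;
    if (z->seq)
        z->wp += offset;
}
```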

    Signed-off-by: Damien Le Moal
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bart Van Assche
    Signed-off-by: Mike Snitzer

    Damien Le Moal
     
  • 1) Introduce DM_TARGET_ZONED_HM feature flag:

    The target drivers currently available will not operate correctly if a
    table target maps onto a host-managed zoned block device.

    To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to
    allow a target to explicitly state that it supports host-managed zoned
    block devices. This feature is checked for all targets in a table if
    any of the table's block devices are host-managed.

    Note that as host-aware zoned block devices are backward compatible with
    regular block devices, they can be used by any of the current target
    types. This new feature is thus restricted to host-managed zoned block
    devices.

    2) Check device area zone alignment:

    If a target maps to a zoned block device, check that the device area is
    aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET
    operations (resetting a partially mapped sequential zone would not be
    possible). This also facilitates the processing of zone report with
    REQ_OP_ZONE_REPORT bios.

    3) Check block devices zone model compatibility

    When setting the DM device's queue limits, several possibilities exist
    for zoned block devices:
    1) The DM target driver may want to expose a different zone model
    (e.g. host-managed device emulation or regular block device on top of
    host-managed zoned block devices)
    2) Expose the underlying zone model of the devices as-is

    To allow both cases, the underlying block device zone model must be set
    in the target limits in dm_set_device_limits() and the compatibility of
    all devices checked similarly to the logical block size alignment. For
    this last check, introduce validate_hardware_zoned_model() to check that
    all targets of a table have the same zone model and that the zone size
    of the target devices are equal.

    Signed-off-by: Damien Le Moal
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Bart Van Assche
    [Mike Snitzer refactored Damien's original work to simplify the code]
    Signed-off-by: Mike Snitzer

    Damien Le Moal
     
  • Using pr_<level> is the more common logging style.

    Standardize style and use new macro DM_FMT().
    Use no_printk() in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined.

    Signed-off-by: Joe Perches
    Signed-off-by: Mike Snitzer

    Joe Perches