30 Jan, 2015

1 commit

  • commit a59db67656021fa212e9b95a583f13c34eb67cd9 upstream.

    Introduce a new variable to count the number of allocated migration
    structures. The existing variable cache->nr_migrations became
    overloaded. It was used to:

    i) track the number of migrations in flight, for the purposes of
    quiescing during suspend.

    ii) estimate the amount of background IO occurring.

    Recent discard changes meant that REQ_DISCARD bios are processed with
    a migration. Discards are not background IO, so nr_migrations was not
    incremented; however, this could cause quiescing to complete early.

    (i) is now handled with a new variable cache->nr_allocated_migrations.
    cache->nr_migrations has been renamed cache->nr_io_migrations.
    cleanup_migration() is now called free_io_migration(), since it
    decrements that variable.

    Also, remove the unused cache->next_migration variable that was
    replaced with prealloc_structs a while ago.
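
    A minimal sketch of the resulting split (field and helper names follow
    the text above; the exact upstream code may differ):

      struct cache {
              ...
              atomic_t nr_allocated_migrations; /* (i) quiescing on suspend */
              atomic_t nr_io_migrations;        /* (ii) background IO estimate */
      };

      static void free_io_migration(struct dm_cache_migration *mg)
      {
              atomic_dec(&mg->cache->nr_io_migrations); /* background IO done */
              free_migration(mg); /* drops nr_allocated_migrations */
      }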

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Joe Thornber
     

09 Jan, 2015

3 commits


10 Sep, 2014

1 commit

  • When a writeback or a promotion of a block is completed, the cell of
    that block is removed from the prison, the block is marked as clean, and
    the clear_dirty() callback of the cache policy is called.

    Unfortunately, performing those actions in this order allows an
    incoming write bio for that block to arrive before the dirty status
    has been cleared, possibly causing one of these two scenarios:

    Scenario A:

    Thread 1                       Thread 2
    cell_defer()                   .
    - cell removed from prison     .
    - detained bios queued         .
    .                              incoming write bio
    .                              remapped to cache
    .                              set_dirty() called,
    .                                but block already dirty
    .                                => it does nothing
    clear_dirty()                  .
    - block marked clean           .
    - policy clear_dirty() called  .

    Result: Block is marked clean even though it is actually dirty. No
    writeback will occur.

    Scenario B:

    Thread 1                       Thread 2
    cell_defer()                   .
    - cell removed from prison     .
    - detained bios queued         .
    clear_dirty()                  .
    - block marked clean           .
    .                              incoming write bio
    .                              remapped to cache
    .                              set_dirty() called
    .                              - block marked dirty
    .                              - policy set_dirty() called
    - policy clear_dirty() called  .

    Result: Block is properly marked as dirty, but the policy thinks it is
    clean and therefore never asks us to write it back.

    This case is visible in the "dmsetup status" dirty block count (which
    normally decreases to 0 on a quiet device).

    Fix these issues by calling clear_dirty() before calling cell_defer().
    Incoming bios for that block will then be detained in the cell and
    released only after clear_dirty() has completed, so the race will not
    occur.
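
    In code terms the fix is an ordering swap; a hedged sketch using the
    function names mentioned above:

      /* Before (racy): bios are released while the block is still dirty. */
      cell_defer(cache, mg->old_ocell, false);
      clear_dirty(cache, mg->old_oblock, mg->cblock);

      /* After (fixed): bios stay detained until the dirty state is clear. */
      clear_dirty(cache, mg->old_oblock, mg->cblock);
      cell_defer(cache, mg->old_ocell, false);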

    Found by inspecting the code after noticing spurious dirty counts
    (scenario B).

    Signed-off-by: Anssi Hannula
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Anssi Hannula
     

02 Aug, 2014

5 commits

  • Before, if the block layer's limit stacking didn't establish an
    optimal_io_size that was compatible with the cache's data block size,
    we'd set optimal_io_size to the data block size and minimum_io_size
    to 0 (which the block layer adjusts to be physical_block_size).

    Update cache_io_hints() to set both minimum_io_size and optimal_io_size
    to the cache's data block size. This fixes an issue where mkfs.xfs
    would create more XFS Allocation Groups on cache volumes than on a
    normal linear LV of comparable size.
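
    A hedged sketch of the updated hint logic (the exact condition and
    field names are assumptions based on the description above):

      static void cache_io_hints(struct dm_target *ti, struct queue_limits *limits)
      {
              struct cache *cache = ti->private;
              uint64_t io_opt_sectors = limits->io_opt >> SECTOR_SHIFT;

              /*
               * If the stacked limits are already compatible with the
               * cache's data block size, leave them alone; otherwise set
               * both hints to the data block size.
               */
              if (io_opt_sectors < cache->sectors_per_block ||
                  do_div(io_opt_sectors, cache->sectors_per_block)) {
                      blk_limits_io_min(limits,
                                        cache->sectors_per_block << SECTOR_SHIFT);
                      blk_limits_io_opt(limits,
                                        cache->sectors_per_block << SECTOR_SHIFT);
              }
      }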

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
  • Commit 7d48935e cleaned up the persistent-data's space-map-metadata
    limits by elevating them to dm-space-map-metadata.h. Update
    dm-cache-metadata to use these same limits.

    The calculation for DM_CACHE_METADATA_MAX_SECTORS didn't account for
    the size of the disk_bitmap_header, so the supported maximum metadata
    size is a bit smaller (reduced from 33423360 to 33292800 sectors).
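
    The arithmetic, assuming 4KiB metadata blocks, 2 bits per bitmap entry
    and a 16-byte disk_bitmap_header:

      entries per bitmap block = (4096 - 16) * 8 / 2 = 16320
      max metadata blocks      = 255 * 16320        = 4161600
      max metadata sectors     = 4161600 * 8        = 33292800

    Ignoring the header gives 255 * (4096 * 8 / 2) * 8 = 33423360, the old
    over-estimate.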

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
     
  • Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     
  • Factor out inc_and_issue and inc_ds helpers to simplify deferred set
    reference count increments. Also cleanup cache_map to consistently call
    cell_defer and inc_ds when the bio is DM_MAPIO_REMAPPED.

    No functional change.
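
    A hedged sketch of the helpers (signatures inferred from the
    surrounding dm-cache code, so treat them as assumptions):

      static void inc_ds(struct cache *cache, struct bio *bio,
                         struct dm_bio_prison_cell *cell)
      {
              size_t pb_data_size = get_per_bio_data_size(cache);
              struct per_bio_data *pb = get_per_bio_data(bio, pb_data_size);

              BUG_ON(!cell);
              BUG_ON(pb->all_io_entry);

              pb->all_io_entry = dm_deferred_entry_inc(cache->all_io_ds);
      }

      static void inc_and_issue(struct cache *cache, struct bio *bio,
                                struct dm_bio_prison_cell *cell)
      {
              inc_ds(cache, bio, cell);
              issue(cache, bio);
      }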

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     
  • nr_dirty is updated without locking, causing it to drift so that it is
    non-zero (either a small positive integer, or a very large one when an
    underflow occurs) even when there are no actual dirty blocks. This was
    due to a race between the workqueue and map function accessing nr_dirty
    in parallel without proper protection.

    People were seeing underruns due to a race on increment/decrement of
    nr_dirty; see: https://lkml.org/lkml/2014/6/3/648

    Fix this by using an atomic_t for nr_dirty.
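
    A minimal sketch of the change (helper bodies are assumptions based on
    the description above):

      static void set_dirty(struct cache *cache, dm_oblock_t oblock,
                            dm_cblock_t cblock)
      {
              if (!test_and_set_bit(from_cblock(cblock), cache->dirty_bitset)) {
                      atomic_inc(&cache->nr_dirty); /* was an unprotected ++ */
                      policy_set_dirty(cache->policy, oblock);
              }
      }

      static void clear_dirty(struct cache *cache, dm_oblock_t oblock,
                              dm_cblock_t cblock)
      {
              if (test_and_clear_bit(from_cblock(cblock), cache->dirty_bitset)) {
                      policy_clear_dirty(cache->policy, oblock);
                      if (atomic_dec_return(&cache->nr_dirty) == 0)
                              dm_table_event(cache->ti->table);
              }
      }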

    Reported-by: roma1390@gmail.com
    Signed-off-by: Anssi Hannula
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Anssi Hannula
     

27 May, 2014

1 commit


02 May, 2014

1 commit

  • Commit 2ee57d58735 ("dm cache: add passthrough mode") inadvertently
    removed the deferred set reference that was taken in cache_map()'s
    writethrough mode support. Restore taking this reference.

    This issue was found with code inspection.

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber
    Cc: stable@vger.kernel.org # 3.13+

    Mike Snitzer
     

05 Apr, 2014

1 commit

  • When suspending a cache the policy is walked and the individual policy
    hints written to the metadata via sync_metadata(). This led to this
    lock order:

    policy->lock
    cache_metadata->root_lock

    When loading the cache target the policy is populated while the metadata
    lock is held:

    cache_metadata->root_lock
    policy->lock

    Fix this potential lock-inversion (ABBA) deadlock in sync_metadata() by
    ensuring the cache_metadata root_lock is held whilst all the hints are
    written, rather than being repeatedly locked while policy->lock is held
    (as was the case with each callout that policy_walk_mappings() made to
    the old save_hint() method).

    Found by turning on the CONFIG_PROVE_LOCKING ("Lock debugging: prove
    locking correctness") build option. However, it is not clear how the
    LOCKDEP reported paths can lead to a deadlock since the two paths,
    suspending a target and loading a target, never occur at the same time.
    But that doesn't mean the same lock-inversion couldn't have occurred
    elsewhere.
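
    A hedged sketch of the fixed path: root_lock is taken once, up front,
    so both paths acquire the locks in the same order (function names are
    assumptions):

      int dm_cache_write_hints(struct dm_cache_metadata *cmd,
                               struct dm_cache_policy *policy)
      {
              int r;

              down_write(&cmd->root_lock);
              r = write_hints(cmd, policy); /* policy->lock inside root_lock */
              up_write(&cmd->root_lock);

              return r;
      }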

    Reported-by: Marian Csontos
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     

28 Mar, 2014

2 commits

  • Discard block size not being equal to cache block size causes data
    corruption by erroneously avoiding migrations in issue_copy() because
    the discard state is being cleared for a group of cache blocks when it
    should not.

    Completely remove all code that enabled a distinction between the
    cache block size and discard block size.

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Mike Snitzer

    Heinz Mauelshagen
     
  • If the discard block size is larger than the cache block size we will
    not properly quiesce IO to a region that is about to be discarded. This
    results in a race between a cache migration where no copy is needed, and
    a write to an adjacent cache block that's within the same large discard
    block.

    Work around this by limiting the discard_block_size to
    cache_block_size. Also limit the max_discard_sectors to
    cache_block_size.
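
    A sketch of the workaround (field names per the description above; the
    exact code is an assumption):

      static void set_discard_limits(struct cache *cache,
                                     struct queue_limits *limits)
      {
              limits->max_discard_sectors = cache->sectors_per_block;
              limits->discard_granularity =
                      cache->sectors_per_block << SECTOR_SHIFT;
      }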

    A more comprehensive fix that introduces range locking support in the
    bio_prison and proper quiescing of a discard range that spans multiple
    cache blocks is already in development.

    Reported-by: Morgan Mears
    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber
    Acked-by: Heinz Mauelshagen
    Cc: stable@vger.kernel.org

    Mike Snitzer
     

13 Mar, 2014

2 commits

  • In order to avoid wasting cache space a partial block at the end of the
    origin device is not cached. Unfortunately, the check for such a
    partial block at the end of the origin device was flawed.

    Fix accesses beyond the end of the origin device that occurred due to
    attempted promotion of an undetected partial block by:

    - initializing the per bio data struct to allow cache_end_io to work properly
    - recognizing access to the partial block at the end of the origin device
    - avoiding out of bounds access to the discard bitset

    Otherwise, users can experience errors like the following:

    attempt to access beyond end of device
    dm-5: rw=0, want=20971520, limit=20971456
    ...
    device-mapper: cache: promotion failed; couldn't copy block
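
    A hedged sketch of the recognition step in cache_map() (condition per
    the description; the surrounding code is assumed):

      dm_oblock_t block = get_bio_block(cache, bio);

      if (unlikely(from_oblock(block) >= from_oblock(cache->origin_blocks))) {
              /*
               * This can only occur if the io goes to a partial block at
               * the end of the origin device.  We don't cache these.
               * Just remap to the origin and carry on.
               */
              remap_to_origin(cache, bio);
              return DM_MAPIO_REMAPPED;
      }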

    Signed-off-by: Heinz Mauelshagen
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Heinz Mauelshagen
     
  • During demotion or promotion to a cache's >2TB fast device, we must
    not truncate the cache block's associated sector to 32 bits. The
    32-bit temporary result of from_cblock() caused a 32-bit
    multiplication when calculating the sector of the fast device in
    issue_copy_real().

    Use an intermediate 64-bit type to store the 32-bit from_cblock()
    result to allow for a proper 64-bit multiplication.
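
    A minimal sketch of the widened arithmetic in issue_copy_real()
    (assumed, per the description above):

      uint64_t cblock = from_cblock(mg->cblock); /* widen the 32-bit value */
      struct dm_io_region c_region;

      c_region.bdev = cache->cache_dev->bdev;
      c_region.sector = cblock * cache->sectors_per_block; /* 64-bit multiply */
      c_region.count = cache->sectors_per_block;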

    Here is an example of how this bug manifests on an ext4 filesystem:

    EXT4-fs error (device dm-0): ext4_mb_generate_buddy:756: group 17136, 32768 clusters in bitmap, 30688 in gd; block bitmap corrupt.
    JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

    Signed-off-by: Heinz Mauelshagen
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Heinz Mauelshagen
     

28 Feb, 2014

1 commit

  • When remapping a block to the cache's fast device that is larger than
    2TB, we must not truncate the destination sector to 32 bits. The
    32-bit temporary result of from_cblock() was being overflowed in
    remap_to_cache() due to the logical left shift.

    Use an intermediate 64-bit type to store the 32-bit from_cblock()
    result to fix the overflow.
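
    The remap_to_cache() counterpart, sketched under the same assumption
    (power-of-two block size branch only):

      sector_t block = from_cblock(cblock); /* widen the 32-bit cblock */
      sector_t bi_sector = bio->bi_iter.bi_sector;

      bio->bi_iter.bi_sector =
              (block << cache->sectors_per_block_shift) |
              (bi_sector & (cache->sectors_per_block - 1));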

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Heinz Mauelshagen
     

18 Feb, 2014

2 commits

  • When completing an overwrite bio, in overwrite_endio(), the associated
    migration should not be added to the 'completed_migrations' until the
    bio's fields are restored with dm_unhook_bio().

    Otherwise, do_worker() can race to process 'completed_migrations' before
    dm_unhook_bio() -- so the bio's bi_end_io is incorrect. This is
    unlikely to cause any problems given the current code but should be
    fixed on the basis of correctness.

    Also, the cache's spinlock only needs to be held when manipulating the
    'completed_migrations' list -- other changes don't need protection.
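
    A hedged sketch of the corrected completion (based on the description;
    not necessarily the exact upstream code):

      static void overwrite_endio(struct bio *bio, int err)
      {
              struct dm_cache_migration *mg = bio->bi_private;
              struct cache *cache = mg->cache;
              size_t pb_data_size = get_per_bio_data_size(cache);
              struct per_bio_data *pb = get_per_bio_data(bio, pb_data_size);
              unsigned long flags;

              dm_unhook_bio(&pb->hook_info, bio); /* restore bi_end_io first */

              if (err)
                      mg->err = true;

              mg->requeue_holder = false;

              /* only the list manipulation needs the spinlock */
              spin_lock_irqsave(&cache->lock, flags);
              list_add_tail(&mg->list, &cache->completed_migrations);
              spin_unlock_irqrestore(&cache->lock, flags);

              wake_worker(cache);
      }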

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
     
  • Commit c9d28d5d ("dm cache: promotion optimisation for writes")
    incorrectly placed the 'hook_info' member in the writethrough-only
    portion of the per_bio_data structure.

    Given that the overwrite optimization may be used for writeback the
    'hook_info' member must be placed above the 'cache' member of the
    per_bio_data structure. Any members above 'cache' are available from
    both writeback and writethrough modes' per_bio_data structure.
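
    A sketch of the resulting layout (comments added; treat the exact
    member list as an assumption):

      struct per_bio_data {
              bool tick:1;
              unsigned req_nr:2;
              struct dm_deferred_entry *all_io_entry;
              struct dm_hook_info hook_info; /* above 'cache': both modes */

              /*
               * writethrough fields.  These MUST remain at the end of this
               * structure and the 'cache' member must be the first as it
               * is used to determine the offset of the writethrough fields.
               */
              struct cache *cache;
              dm_cblock_t cblock;
              struct dm_bio_details bio_details;
      };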

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber
    Cc: stable@vger.kernel.org # 3.13+

    Mike Snitzer
     

31 Jan, 2014

1 commit

  • Pull core block IO changes from Jens Axboe:
    "The major piece in here is the immutable bio_ve series from Kent, the
    rest is fairly minor. It was supposed to go in last round, but
    various issues pushed it to this release instead. The pull request
    contains:

    - Various smaller blk-mq fixes from different folks. Nothing major
    here, just minor fixes and cleanups.

    - Fix for a memory leak in the error path in the block ioctl code
    from Christian Engelmayer.

    - Header export fix from CaiZhiyong.

    - Finally the immutable biovec changes from Kent Overstreet. This
    enables some nice future work on making arbitrarily sized bios
    possible, and splitting more efficient. Related fixes to immutable
    bio_vecs:

    - dm-cache immutable fixup from Mike Snitzer.
    - btrfs immutable fixup from Muthu Kumar.

    - bio-integrity fix from Nic Bellinger, which is also going to stable"

    * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
    xtensa: fixup simdisk driver to work with immutable bio_vecs
    block/blk-mq-cpu.c: use hotcpu_notifier()
    blk-mq: for_each_* macro correctness
    block: Fix memory leak in rw_copy_check_uvector() handling
    bio-integrity: Fix bio_integrity_verify segment start bug
    block: remove unrelated header files and export symbol
    blk-mq: uses page->list incorrectly
    blk-mq: use __smp_call_function_single directly
    btrfs: fix missing increment of bi_remaining
    Revert "block: Warn and free bio if bi_end_io is not set"
    block: Warn and free bio if bi_end_io is not set
    blk-mq: fix initializing request's start time
    block: blk-mq: don't export blk_mq_free_queue()
    block: blk-mq: make blk_sync_queue support mq
    block: blk-mq: support draining mq queue
    dm cache: increment bi_remaining when bi_end_io is restored
    block: fixup for generic bio chaining
    block: Really silence spurious compiler warnings
    block: Silence spurious compiler warnings
    block: Kill bio_pair_split()
    ...

    Linus Torvalds
     

17 Jan, 2014

1 commit

  • The cache's policy may have been established using the "default" alias,
    which is currently the "mq" policy but the default policy may change in
    the future. It is useful to know exactly which policy is being used.

    Add a 'real' member to the dm_cache_policy_type structure and have the
    "default" dm_cache_policy_type point to the real "mq"
    dm_cache_policy_type. Update dm_cache_policy_get_name() to check if
    real is set, if so report the name of the real policy (not the alias).
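
    A sketch of the lookup (hedged; the surrounding structure is assumed):

      const char *dm_cache_policy_get_name(struct dm_cache_policy *p)
      {
              struct dm_cache_policy_type *t = p->private;

              /* if t->real is set then an alias was used (e.g. "default") */
              if (t->real)
                      return t->real->name;

              return t->name;
      }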

    Requested-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

10 Jan, 2014

1 commit

  • Improve cache_status to emit:

      <metadata block size> <#used metadata blocks>/<#total metadata blocks>
      <cache block size> <#used cache blocks>/<#total cache blocks>
      ...

    Adding the block sizes allows for easier calculation of the overall
    size of both the metadata and cache devices. Adding <#total cache
    blocks> provides useful context for how much of the cache is used.

    Unfortunately these additions to the status will require updates to
    users' scripts that monitor the cache status. But these changes help
    provide more comprehensive information about the cache device and will
    simplify tools that are being developed to manage dm-cache devices --
    because they won't need to issue 3 operations to cobble together the
    information that we can easily provide via a single status ioctl.

    While updating the status documentation in cache.txt, spaces were
    converted to tabs.
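
    A hypothetical status line in the new format (all field values are
    invented for illustration):

      # dmsetup status my_cache
      0 409600 cache 8 27/2048 512 128/4096 143 282 18 6 0 5 4 ...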

    Requested-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
     

01 Jan, 2014

1 commit

  • Needed to bring blk-mq up to date, since changes have been going in
    since for-3.14/core was established.

    Fixup merge issues related to the immutable biovec changes.

    Signed-off-by: Jens Axboe

    Conflicts:
    block/blk-flush.c
    fs/btrfs/check-integrity.c
    fs/btrfs/extent_io.c
    fs/btrfs/scrub.c
    fs/logfs/dev_bdev.c

    Jens Axboe
     

11 Dec, 2013

1 commit

  • Commit f494a9c6b1b6dd9a9f21bbb75d9210d478eeb498 ("dm cache: cache
    shrinking support") broke cache resizing support.

    dm_cache_resize() is called with cache->cache_size before it gets
    updated to new_size, so it is a no-op. But the dm-cache superblock is
    updated with the new_size even though the backing dm-array is not
    resized. Fix this by passing the new_size to dm_cache_resize().
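
    A minimal sketch of the fix (function name assumed):

      static bool resize_cache_dev(struct cache *cache, dm_cblock_t new_size)
      {
              int r;

              /* was dm_cache_resize(cache->cmd, cache->cache_size): a no-op */
              r = dm_cache_resize(cache->cmd, new_size);
              if (r) {
                      DMERR("could not resize cache metadata");
                      return false;
              }

              cache->cache_size = new_size;
              return true;
      }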

    Signed-off-by: Vincent Pelletier
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Vincent Pelletier
     

04 Dec, 2013

1 commit


24 Nov, 2013

2 commits

  • This adds a generic mechanism for chaining bio completions. This is
    going to be used for a bio_split() replacement, and it turns out to be
    very useful in a fair amount of driver code - a fair number of drivers
    were implementing this in their own roundabout ways, often painfully.

    Note that this means it's no longer possible to call bio_endio() more
    than once on the same bio! This can cause problems for drivers that
    save/restore bi_end_io. Arguably they shouldn't be saving/restoring
    bi_end_io at all
    - in all but the simplest cases they'd be better off just cloning the
    bio, and immutable biovecs is making bio cloning cheaper. But for now,
    we add a bio_endio_nodec() for these cases.
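
    A hedged usage sketch of the chaining primitive this adds:

      struct bio *split = bio_alloc(GFP_NOIO, 0);

      bio_chain(split, parent); /* parent won't complete until split does */
      submit_bio(rw, split);
      /* parent's bi_end_io now runs only after both bios complete */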

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe

    Kent Overstreet
     
  • Immutable biovecs are going to require an explicit iterator. To
    implement immutable bvecs, a later patch is going to add a bi_bvec_done
    member to this struct; for now, this patch effectively just renames
    things.
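
    The rename in sketch form (bi_bvec_done is the member a later patch
    adds):

      struct bvec_iter {
              sector_t        bi_sector; /* was bio->bi_sector */
              unsigned int    bi_size;   /* was bio->bi_size */
              unsigned int    bi_idx;    /* was bio->bi_idx */
      };

    so callers now write bio->bi_iter.bi_sector instead of bio->bi_sector.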

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Geert Uytterhoeven
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Ed L. Cashin"
    Cc: Nick Piggin
    Cc: Lars Ellenberg
    Cc: Jiri Kosina
    Cc: Matthew Wilcox
    Cc: Geoff Levand
    Cc: Yehuda Sadeh
    Cc: Sage Weil
    Cc: Alex Elder
    Cc: ceph-devel@vger.kernel.org
    Cc: Joshua Morris
    Cc: Philip Kelleher
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Konrad Rzeszutek Wilk
    Cc: Jeremy Fitzhardinge
    Cc: Neil Brown
    Cc: Alasdair Kergon
    Cc: Mike Snitzer
    Cc: dm-devel@redhat.com
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: Boaz Harrosh
    Cc: Benny Halevy
    Cc: "James E.J. Bottomley"
    Cc: Greg Kroah-Hartman
    Cc: "Nicholas A. Bellinger"
    Cc: Alexander Viro
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Cc: Jaegeuk Kim
    Cc: Steven Whitehouse
    Cc: Dave Kleikamp
    Cc: Joern Engel
    Cc: Prasad Joshi
    Cc: Trond Myklebust
    Cc: KONISHI Ryusuke
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Ben Myers
    Cc: xfs@oss.sgi.com
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: Herton Ronaldo Krzesinski
    Cc: Ben Hutchings
    Cc: Andrew Morton
    Cc: Guo Chao
    Cc: Tejun Heo
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Wei Yongjun
    Cc: "Roger Pau Monné"
    Cc: Jan Beulich
    Cc: Stefano Stabellini
    Cc: Ian Campbell
    Cc: Sebastian Ott
    Cc: Christian Borntraeger
    Cc: Minchan Kim
    Cc: Jiang Liu
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Joe Perches
    Cc: Peng Tao
    Cc: Andy Adamson
    Cc: fanchaoting
    Cc: Jie Liu
    Cc: Sunil Mushran
    Cc: "Martin K. Petersen"
    Cc: Namjae Jeon
    Cc: Pankaj Kumar
    Cc: Dan Magenheimer
    Cc: Mel Gorman

    Kent Overstreet
     

13 Nov, 2013

1 commit


12 Nov, 2013

3 commits

  • Cache block invalidation is removing an entry from the cache without
    writing it back. Cache blocks can be invalidated via the
    'invalidate_cblocks' message, which takes an arbitrary number of cblock
    ranges:
    invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*

    E.g.
    dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     
  • "Passthrough" is a dm-cache operating mode (like writethrough or
    writeback) which is intended to be used when the cache contents are not
    known to be coherent with the origin device. It behaves as follows:

    * All reads are served from the origin device (all reads miss the cache)
    * All writes are forwarded to the origin device; additionally, write
    hits cause cache block invalidates

    This mode decouples cache coherency checks from cache device creation,
    largely to avoid having to perform coherency checks while booting. Boot
    scripts can create cache devices in passthrough mode and put them into
    service (mount cached filesystems, for example) without having to worry
    about coherency. Coherency that exists is maintained, although the
    cache will gradually cool as writes take place.

    Later, applications can perform coherency checks, the nature of which
    will depend on the type of the underlying storage. If coherency can be
    verified, the cache device can be transitioned to writethrough or
    writeback mode while still warm; otherwise, the cache contents can be
    discarded prior to transitioning to the desired operating mode.
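
    A hypothetical table line illustrating passthrough at creation time
    (device names and sizes invented):

      dmsetup create my_cache --table \
        "0 409600 cache /dev/mapper/meta /dev/mapper/fast /dev/mapper/slow \
         512 1 passthrough default 0"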

    Signed-off-by: Joe Thornber
    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Morgan Mears
    Signed-off-by: Mike Snitzer

    Joe Thornber
     
  • Allow a cache to shrink if the blocks being removed from the cache are
    not dirty.
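
    A hedged sketch of the check this implies (names are assumptions):

      static bool can_resize(struct cache *cache, dm_cblock_t new_size)
      {
              while (from_cblock(new_size) < from_cblock(cache->cache_size)) {
                      if (is_dirty(cache, new_size)) {
                              DMERR("unable to shrink cache; cache block %llu is dirty",
                                    (unsigned long long) from_cblock(new_size));
                              return false;
                      }
                      new_size = to_cblock(from_cblock(new_size) + 1);
              }

              return true;
      }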

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     

10 Nov, 2013

8 commits