02 Aug, 2011

3 commits

  • If we write a full chunk in the snapshot, skip reading the origin device
    because the whole chunk will be overwritten anyway.

    This patch changes the snapshot write logic when a full chunk is written.
    In this case:
    1. allocate the exception
    2. dispatch the bio (but don't report the bio completion to device mapper)
    3. write the exception record
    4. report bio completed

    Callbacks must be done through the kcopyd thread, because callbacks must not
    race with each other. So we create two new functions:

    dm_kcopyd_prepare_callback: allocate a job structure and prepare the callback.
    (This function must not be called from interrupt context.)

    dm_kcopyd_do_callback: submit callback.
    (This function may be called from interrupt context.)
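
    For example, a minimal usage sketch (dm_kcopyd_prepare_callback()
    and dm_kcopyd_do_callback() are the new API; s->kcopyd_client,
    complete_exception and pe are illustrative snapshot-side names):

        /* Process context: allocate the job and attach the callback. */
        void *job = dm_kcopyd_prepare_callback(s->kcopyd_client,
                                               complete_exception, pe);

        /* Later, possibly from interrupt context (e.g. bio endio):
         * submit the job so the kcopyd thread runs complete_exception()
         * without racing other callbacks. */
        dm_kcopyd_do_callback(job, 0, 0);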

    Performance test (on snapshots with 4k chunk size):
    without the patch:
    non-direct-io sequential write (dd): 17.7MB/s
    direct-io sequential write (dd): 20.9MB/s
    non-direct-io random write (mkfs.ext2): 0.44s

    with the patch:
    non-direct-io sequential write (dd): 26.5MB/s
    direct-io sequential write (dd): 33.2MB/s
    non-direct-io random write (mkfs.ext2): 0.27s

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Coding style cleanups.

    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Jonathan Brassow

    Jonathan Brassow
     
  • Remove a couple of unused #defines.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

24 Mar, 2011

1 commit

  • If a table is read-only, also open any log and cow devices it uses read-only.

    Previously, even read-only devices were opened read-write internally.
    After commit 75f1dc0d076d1c1168f2115f1941ea627d38bd5a
    ("block: check bdev_read_only() from blkdev_get()")
    was applied, loading such tables began to fail. That commit
    was reverted by e51900f7d38cbcfb481d84567fd92540e7e1d23a
    ("block: revert block_dev read-only check"),
    but this patch fixes this part of dm to work with the original check.
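
    A sketch of the idea (cow_path and s->cow are illustrative;
    dm_get_device() and dm_table_get_mode() are the dm core API):

        /* Open the cow device with the table's mode instead of a
         * hard-coded FMODE_READ | FMODE_WRITE, so a read-only table
         * opens its cow (and log) devices read-only too. */
        r = dm_get_device(ti, cow_path, dm_table_get_mode(ti->table),
                          &s->cow);
        if (r) {
                ti->error = "Cannot get COW device";
                return r;
        }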

    Signed-off-by: Milan Broz
    Signed-off-by: Alasdair G Kergon

    Milan Broz
     

14 Jan, 2011

2 commits

  • Use dm_suspended() rather than having each snapshot target maintain a
    private 'suspended' flag in struct dm_snapshot.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • dm_snapshot->queued_bios_work isn't used. Remove ->queued_bios[_work]
    from the dm_snapshot structure, the flush_queued_bios work function
    and the ksnapd workqueue.

    The DM snapshot changes that were going to use the ksnapd workqueue were
    either superseded (fix for origin write races) or never completed
    (deallocation of invalid snapshot's memory via workqueue).

    Signed-off-by: Tejun Heo
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Tejun Heo
     

23 Oct, 2010

1 commit

  • Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits)
    xen-blkfront: disable barrier/flush write support
    blk-lib.c added and blk-barrier.c renamed to blk-flush.c
    block: remove BLKDEV_IFL_WAIT
    aic7xxx_old: removed unused 'req' variable
    block: remove the BH_Eopnotsupp flag
    block: remove the BLKDEV_IFL_BARRIER flag
    block: remove the WRITE_BARRIER flag
    swap: do not send discards as barriers
    fat: do not send discards as barriers
    ext4: do not send discards as barriers
    jbd2: replace barriers with explicit flush / FUA usage
    jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier
    jbd: replace barriers with explicit flush / FUA usage
    nilfs2: replace barriers with explicit flush / FUA usage
    reiserfs: replace barriers with explicit flush / FUA usage
    gfs2: replace barriers with explicit flush / FUA usage
    btrfs: replace barriers with explicit flush / FUA usage
    xfs: replace barriers with explicit flush / FUA usage
    block: pass gfp_mask and flags to sb_issue_discard
    dm: convey that all flushes are processed as empty
    ...

    Linus Torvalds
     

10 Sep, 2010

1 commit

  • This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
    the now deprecated REQ_HARDBARRIER.

    * -EOPNOTSUPP handling logic dropped.

    * Preflush is handled as before but postflush is dropped and replaced
    with passing down REQ_FUA to member request_queues. This replaces
    one array wide cache flush w/ member specific FUA writes.

    * __split_and_process_bio() now calls __clone_and_map_flush() directly
    for flushes and guarantees all FLUSH bio's going to targets are zero
    length.

    * It's now guaranteed that all FLUSH bio's which are passed onto dm
    targets are zero length. bio_empty_barrier() tests are replaced
    with REQ_FLUSH tests.

    * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.

    * Dropped unlikely() around REQ_FLUSH tests. Flushes are not unlikely
    enough to be marked with unlikely().

    * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
    doesn't support cache flushing. Advertise REQ_FLUSH | REQ_FUA
    capability.

    * Request-based dm isn't converted yet. dm_init_request_based_queue()
    resets flush support to 0 for now. To avoid disturbing request-based
    dm code, dm->flush_error is added for bio-based dm while
    request-based dm continues to use dm->barrier_error.

    Lightly tested linear, stripe, raid1, snap and crypt targets. Please
    proceed with caution as I'm not familiar with the code base.
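
    A sketch of the resulting conventions (md->queue and process_flush()
    are illustrative stand-ins; blk_queue_flush(), REQ_FLUSH and REQ_FUA
    are the 2.6.37-era block API):

        /* Advertise cache-flush capability; the block layer filters
         * REQ_FLUSH/FUA out for queues that don't declare it. */
        blk_queue_flush(md->queue, REQ_FLUSH | REQ_FUA);

        /* In the map path, bio_empty_barrier() tests become: */
        if (bio->bi_rw & REQ_FLUSH) {
                /* Guaranteed zero-length flush bio: nothing to clone. */
                process_flush(md, bio);
                return;
        }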

    Signed-off-by: Tejun Heo
    Cc: dm-devel@redhat.com
    Cc: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Tejun Heo
     

12 Aug, 2010

4 commits

  • 'target_request_nr' is a more generic name that reflects the fact that
    it will be used for both flush and discard support.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Implement merge method for the snapshot origin to improve read
    performance.

    Without a merge method, dm asks the upper layers to submit the smallest
    possible bios --- one page. Submitting such small bios impacts
    performance negatively when reading or writing the origin device.

    Without this patch, CPU consumption when reading the origin on lvm on
    md-raid0 was 6 to 12%; with this patch, it drops to 1 to 4%.

    Note: in my testing it actually degraded performance in some settings;
    I traced this to Maxtor disks having problems with requests larger
    than 512 sectors. Setting /sys/block/sd*/queue/max_sectors_kb to 256
    restored read performance. I think we don't have to care about weird
    disks that degrade performance when large requests are sent to them.
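
    A sketch of such a merge method, modeled on dm-linear's (assuming
    the origin target keeps its single struct dm_dev in ti->private):

        static int origin_merge(struct dm_target *ti,
                                struct bvec_merge_data *bvm,
                                struct bio_vec *biovec, int max_size)
        {
                struct dm_dev *dev = ti->private;
                struct request_queue *q = bdev_get_queue(dev->bdev);

                if (!q->merge_bvec_fn)
                        return max_size;

                /* The origin maps 1:1, so bvm->bi_sector needs no
                 * adjustment; just point bvm at the underlying device. */
                bvm->bi_bdev = dev->bdev;

                return min(max_size, q->merge_bvec_fn(q, bvm, biovec));
        }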

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Validate chunk size against both origin and snapshot sector size

    Don't allow chunk size smaller than either origin or snapshot logical
    sector size. Reading or writing data not aligned to sector size is not
    allowed and causes immediate errors.

    This requires us to open the origin before initialising the
    exception store and to export dm_snap_origin.
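
    A sketch of the check (origin_bdev and cow_bdev stand in for the
    opened devices; chunk_size is counted in 512-byte sectors):

        /* The chunk must cover a whole number of logical blocks on
         * both underlying devices, or misaligned I/O would error out. */
        if (chunk_size % (bdev_logical_block_size(origin_bdev) >> 9) ||
            chunk_size % (bdev_logical_block_size(cow_bdev) >> 9)) {
                *error = "Chunk size is not a multiple of device blocksize";
                return -EINVAL;
        }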

    Cc: stable@kernel.org
    Signed-off-by: Mikulas Patocka
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Iterate both origin and snapshot devices

    The iterate_devices method should call the callback for every device
    to which a bio may be remapped. Thus snapshot_iterate_devices should
    call the callback for both the snapshot and origin underlying devices,
    because it remaps some bios to the snapshot and some to the origin.

    Previously, snapshot_iterate_devices called the callback only for the
    origin device. This led to badly calculated device limits if the
    snapshot and origin were placed on different types of disks.
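
    Roughly the shape of the fix (field and helper names follow
    dm-snap.c of this era; get_dev_size() is a local helper there):

        static int snapshot_iterate_devices(struct dm_target *ti,
                                            iterate_devices_callout_fn fn,
                                            void *data)
        {
                struct dm_snapshot *snap = ti->private;
                int r;

                /* Report the origin first, then the cow device, so
                 * limits are stacked from both underlying devices. */
                r = fn(ti, snap->origin, 0, ti->len, data);
                if (!r)
                        r = fn(ti, snap->cow, 0,
                               get_dev_size(snap->cow->bdev), data);

                return r;
        }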

    Cc: stable@kernel.org
    Signed-off-by: Mikulas Patocka
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

11 Dec, 2009

23 commits

  • If the snapshot we are merging became invalid (e.g. it ran out of
    space), redirect all I/O directly to the origin device.

    Signed-off-by: Mikulas Patocka
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Set 'merge_failed' flag if a snapshot fails to merge. Update
    snapshot_status() to report "Merge failed" if 'merge_failed' is set.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • s->store->type->prepare_merge returns the number of chunks that can be
    copied linearly, working backwards from the returned chunk number.

    For example, if it returns 3 chunks with old_chunk == 10 and new_chunk
    == 20, then chunk 20 can be copied to 10, chunk 19 to 9 and chunk 18
    to 8.

    Until now, kcopyd copied only one chunk at a time; this patch copies
    the full set at once.
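
    Illustrative arithmetic for setting up the single copy (names are a
    sketch, not the exact kernel code): with old_chunk == 10, new_chunk
    == 20 and linear_chunks == 3, one kcopyd job copies cow chunks
    18..20 onto origin chunks 8..10:

        sector_t io_size = linear_chunks * s->store->chunk_size;

        src.sector  = chunk_to_sector(s->store,
                                      new_chunk - linear_chunks + 1);
        dest.sector = chunk_to_sector(s->store,
                                      old_chunk - linear_chunks + 1);
        src.count = dest.count = io_size;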

    Consequently, snapshot_merge_process() needs to delay the merging of all
    chunks if any have writes in progress, not just the first chunk in the
    region that is to be merged.

    snapshot-merge's performance is now comparable to the original
    snapshot-origin target.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • When there is one merging snapshot and other non-merging snapshots,
    snapshot_merge_process() must make exceptions in the non-merging
    snapshots.

    Use a sequence count to resolve the race between I/O to chunks that are
    about to be merged. The count increases each time an exception
    reallocation finishes. Use wait_event() to wait until the count
    changes.
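
    A sketch of the scheme (field names are hypothetical; the counter
    only ever increases, so comparing against a stale value is safe):

        /* Merge path: a chunk still has conflicting I/O, so wait for
         * at least one more exception reallocation to finish, then
         * re-check whether the chunk can be merged. */
        u64 seen = atomic64_read(&s->exceptions_completed);
        wait_event(s->exceptions_done_wait,
                   atomic64_read(&s->exceptions_completed) != seen);

        /* Completion path, run after each exception reallocation: */
        atomic64_inc(&s->exceptions_completed);
        wake_up_all(&s->exceptions_done_wait);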

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Track writes to chunks that are currently being merged and delay merging
    a chunk until all writes to that chunk finish.

    Signed-off-by: Mikulas Patocka
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • While a set of chunks is being merged, any overlapping writes need to be
    queued.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Merging is started when the origin is resumed and stopped when the
    origin is suspended, when the merging snapshot is destroyed, or when
    errors are detected.

    Merging is not yet interlocked with writes: this will be handled in
    subsequent patches.

    The code relies on callbacks from a private kcopyd thread.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Merging more than one snapshot is not supported, so prevent
    this happening.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Sets num_flush_requests=2 to support flushing both the origin and cow
    devices used by the snapshot-merge target.

    Also, snapshot_ctr() now gets the origin device using FMODE_WRITE if the
    target is snapshot-merge (which writes to the origin device).
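
    A sketch of the ctr changes (dm_target_is_snapshot_merge() tests
    whether ti is a snapshot-merge target; other steps abbreviated):

        fmode_t origin_mode = FMODE_READ;

        if (dm_target_is_snapshot_merge(ti))
                origin_mode = FMODE_WRITE; /* merging writes the origin */

        /* ... the origin device is then opened with origin_mode ... */

        ti->num_flush_requests = 2;     /* one flush for the origin,
                                           one for the cow device */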

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • The snapshot-merge target should not allocate new exceptions because the
    intent is to merge all of its exceptions as quickly and safely as
    possible.

    This patch introduces the snapshot-merge mapping function and updates
    __origin_write() so that it doesn't allocate exceptions on any snapshots
    that are being merged.

    If a write request to a merging snapshot device is to be dispatched
    directly to the origin (because the chunk is not remapped or was
    already merged), snapshot_merge_map() must make exceptions in the
    other snapshots, so it calls do_origin().

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • To track the completion of exceptions relating to the same location on
    the device, the current code selects one exception as primary_pe, links
    the other exceptions to it and uses reference counting to wait until all
    the reallocations are complete.

    It is considered too complicated to extend this code to handle the new
    snapshot-merge target, where sets of non-overlapping chunks would also
    need to become linked.

    Instead, a simpler (but less efficient) approach is taken. Bios are
    linked to one exception. When it completes, bios are simply retried,
    and if other related exceptions are still outstanding, they'll get
    queued again to wait for another one.
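
    A sketch of the retry path (retry_origin_bio() is a hypothetical
    helper; origin_bios is the exception's queued-bio list):

        static void exception_completed(struct dm_snap_pending_exception *pe)
        {
                struct bio *bio;

                /* Resubmit every bio linked to this exception; any that
                 * still overlap an outstanding exception simply get
                 * queued on that one and wait again. */
                while ((bio = bio_list_pop(&pe->origin_bios)))
                        retry_origin_bio(pe->snap, bio);
        }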

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • The snapshot-merge target allows a snapshot to be merged back into the
    snapshot's origin device.

    One anticipated use of snapshot merging is the rollback of filesystems
    to back out problematic system upgrades.

    This patch adds snapshot-merge target management to both
    dm_snapshot_init() and dm_snapshot_exit(). As an initial place-holder,
    snapshot-merge is identical to the snapshot target. Documentation is
    provided.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Move the __chunk_is_tracked() loop into a separate function as we will
    also need to call it from the write path in the rare case of conflicting
    writes to the same chunk.

    Originally introduced in commit a8d41b59f3f5a7ac19452ef442a7fc1b5fa17366
    ("dm snapshot: fix race during exception creation").

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • To support the merging of snapshots back into their origin we need
    to trigger exceptions in other snapshots not being merged without
    any incoming bio on the origin device. The bio parameter to
    __origin_write() becomes optional and the sector needs supplying
    separately.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Permit in-use snapshot exception data to be 'handed over' from one
    snapshot instance to another. This is a pre-requisite for patches
    that allow the changes made in a snapshot device to be merged back into
    its origin device and also allows device resizing.

    The basic call sequence is:

    dmsetup load new_snapshot (referencing the existing in-use cow device)
        - the ctr code detects that the cow is already in use and allows
          the two snapshot target instances to be linked together
    dmsetup suspend original_snapshot
    dmsetup resume new_snapshot
        - the new_snapshot becomes live, and if anything now tries to
          access the original one it will receive -EIO
    dmsetup remove original_snapshot

    (There can only be two snapshot targets referencing the same cow device
    simultaneously.)

    Signed-off-by: Mike Snitzer
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Keep track of whether or not the device is suspended within the snapshot
    target module, the same as we do in dm-raid1.

    We will use this later to enforce the correct sequence of ioctls to
    transfer the in-core exceptions from a snapshot target instance in
    one table to a replacement one capable of merging them back
    into the origin.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Store the reference to the snapshot cow device in the core snapshot
    code instead of each exception store. It can be accessed through the
    new function dm_snap_cow(). Exception stores should each now maintain a
    reference to their parent snapshot struct.

    This is cleaner and makes part of the forthcoming snapshot merge code simpler.
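
    The accessor itself is trivial; a sketch of its shape:

        struct dm_dev *dm_snap_cow(struct dm_snapshot *s)
        {
                return s->cow;
        }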

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon
    Reviewed-by: Jonathan Brassow
    Cc: Mikulas Patocka

    Mike Snitzer
     
  • Add the number of sectors used by metadata to the end of the
    snapshot's status line, which now reads
    <sectors_allocated>/<total_sectors> <metadata_sectors>.

    Renamed dm_exception_store_type's 'fraction_full' to 'usage'. Renamed
    arguments to be clearer about what is being returned. Also added
    'metadata_sectors'.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Rename exception functions. Preparing to pull them out of
    dm-snap.c for broader use.

    Signed-off-by: Jonathan Brassow
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Jon Brassow
     
  • Rename exception_table for broader use outside dm-snap.c

    Signed-off-by: Jonathan Brassow
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Jon Brassow
     
  • The exception structure is not necessarily just a snapshot element
    (especially after we pull it out of dm-snap.c).

    Rename it appropriately.

    Signed-off-by: Jonathan Brassow
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Jon Brassow
     
  • Consolidate the insert_*exception functions. 'insert_completed_exception'
    already contains all the logic to handle 'insert_exception' (via a
    check for a hash_shift of 0), so remove the redundant function.

    Signed-off-by: Jonathan Brassow
    Reviewed-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Jon Brassow
     
  • The origin needs to find the minimum chunk size of all snapshots.
    Move this logic to a separate function because it will also be used
    elsewhere in the snapshot merge patches.
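
    Roughly the extracted helper (min_not_zero() skips snapshots that
    report a zero chunk size; names follow dm-snap.c of this era):

        static unsigned __minimum_chunk_size(struct origin *o)
        {
                struct dm_snapshot *snap;
                unsigned chunk_size = 0;

                if (o)
                        list_for_each_entry(snap, &o->snapshots, list)
                                chunk_size = min_not_zero(chunk_size,
                                                snap->store->chunk_size);

                return chunk_size;
        }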

    Signed-off-by: Mikulas Patocka
    Reviewed-by: Mike Snitzer
    Reviewed-by: Jonathan Brassow
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka