06 Dec, 2019

1 commit

  • Add support for one pre-commit callback which is run right before the
    metadata are committed.

    This allows the thin provisioning target to run a callback before the
    metadata are committed and is required by the next commit.

    Cc: stable@vger.kernel.org
    Signed-off-by: Nikos Tsironis
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Nikos Tsironis
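
The pre-commit hook described above can be pictured with the following minimal sketch. It is not the kernel's code: the struct layout, the function names and the choice to abort the commit when the callback fails are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical types and names; a sketch, not the dm-thin-metadata API. */
typedef int (*pre_commit_fn)(void *context);

struct pool_metadata {
	pre_commit_fn pre_commit;  /* at most one callback is supported */
	void *pre_commit_context;  /* opaque pointer handed back to the callback */
	unsigned commits;          /* number of commits that completed */
};

/* Register (or replace) the single pre-commit callback. */
void set_pre_commit_callback(struct pool_metadata *pmd,
			     pre_commit_fn fn, void *context)
{
	pmd->pre_commit = fn;
	pmd->pre_commit_context = context;
}

/*
 * Run the callback right before the metadata are committed.
 * Assumption: a non-zero return from the callback aborts the commit.
 */
int commit_metadata(struct pool_metadata *pmd)
{
	if (pmd->pre_commit) {
		int r = pmd->pre_commit(pmd->pre_commit_context);
		if (r)
			return r;
	}
	pmd->commits++;  /* stands in for writing the metadata out */
	return 0;
}

/* Example callback: counts how often it was invoked. */
int note_call(void *context)
{
	(*(int *)context)++;
	return 0;
}
```

A caller would register note_call() once with set_pre_commit_callback() and then see it run on every later commit_metadata().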
     

16 Jan, 2019

1 commit

  • Commit 00a0ea33b495 ("dm thin: do not queue freed thin mapping for next
    stage processing") changed process_prepared_discard_passdown_pt1() to
    increment the reference counts of all the blocks being discarded, and
    to hold those references until after the passdown had completed, to
    avoid the blocks being prematurely reused.

    IO issued to a thin device that breaks sharing with a snapshot, followed
    by a discard issued to snapshot(s) that previously shared the block(s),
    results in passdown_double_checking_shared_status() being called to
    iterate through the blocks, double checking that their reference count
    is zero and issuing the passdown if so. Because commit 00a0ea33b495
    holds an extra reference on each block at that point, the count is
    never zero, so passdown_double_checking_shared_status() was broken.

    Fix this by checking if the block reference count is greater than 1.
    Also, rename dm_pool_block_is_used() to dm_pool_block_is_shared().

    Fixes: 00a0ea33b495 ("dm thin: do not queue freed thin mapping for next stage processing")
    Cc: stable@vger.kernel.org # 4.9+
    Reported-by: ryan.p.norwood@gmail.com
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
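
The difference between the old and new checks can be sketched as two predicates on a block's reference count. The helper names below are hypothetical and work on a plain counter; the kernel functions named in the commit query the pool metadata instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old test: any reference at all counts as "in use". */
bool block_is_used(uint32_t ref_count)
{
	return ref_count > 0;
}

/* New test: only more than one reference counts as "shared". */
bool block_is_shared(uint32_t ref_count)
{
	return ref_count > 1;
}

/*
 * While a discard is in flight the block carries one guard reference,
 * so a count of 1 means nobody else maps it and passdown is safe.
 */
bool may_pass_down(uint32_t ref_count_during_discard)
{
	return !block_is_shared(ref_count_during_discard);
}
```

With the guard reference in place, a count of 1 is still "used" under the old test, which is why the passdown was never issued.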
     

21 Jul, 2016

1 commit

  • The discard passdown was being issued after the block was unmapped,
    which meant the block could be reprovisioned whilst the passdown discard
    was still in flight.

    We can only identify unshared blocks (those safe to pass a discard down
    to) once they're unmapped and their ref count hits zero. Block ref
    counts are now used to guard against concurrent allocation of these
    blocks that are being discarded. So now we unmap the block, issue
    passdown discards, and then immediately increment ref counts for regions
    that have been discarded via passdown (this is safe because
    allocation occurs within the same thread). We then decrement ref counts
    once the passdown discard IO is complete -- signaling these blocks may
    now be allocated.

    This fixes the potential for corruption that was reported here:
    https://www.redhat.com/archives/dm-devel/2016-June/msg00311.html

    Reported-by: Dennis Yang
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
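
The ordering described above (unmap, take a guard reference, drop it when the passdown completes) can be modelled with a toy space map. Everything here is a simplified assumption: a fixed array of counters and a first-fit allocator stand in for the real data space map.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_BLOCKS 4

/* Toy data space map: one reference count per data block. */
struct data_space_map {
	uint32_t ref_count[NR_BLOCKS];
};

/* First-fit allocator: only blocks with a zero count may be handed out. */
int alloc_block(const struct data_space_map *sm)
{
	for (int b = 0; b < NR_BLOCKS; b++)
		if (sm->ref_count[b] == 0)
			return b;
	return -1;
}

/* Drop the mapping's reference; true means the block is now unshared. */
bool unmap_block(struct data_space_map *sm, int b)
{
	sm->ref_count[b]--;
	return sm->ref_count[b] == 0;
}

/* Guard reference held while the passdown discard is in flight. */
void passdown_start(struct data_space_map *sm, int b)
{
	sm->ref_count[b]++;
}

/* Discard IO finished: the block may be allocated again. */
void passdown_complete(struct data_space_map *sm, int b)
{
	sm->ref_count[b]--;
}
```

Between passdown_start() and passdown_complete() the allocator cannot return the block, which is the window in which the original code could reprovision it.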
     

12 Jun, 2015

2 commits


10 Feb, 2015

1 commit


11 Nov, 2014

2 commits


06 Mar, 2014

1 commit

  • If a thin metadata operation fails the current transaction will abort,
    potentially causing IO layers up the stack (e.g. filesystems) to lose
    data. As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
    thin metadata's superblock which:
    1) requires the user verify the thin metadata is consistent (e.g. use
    thin_check, etc)
    2) suggests the user verify the thin data is consistent (e.g. use fsck)

    The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
    to run thin_repair.

    On metadata operation failure: abort current metadata transaction, set
    pool in read-only mode, and now set the needs_check flag.

    As part of this change, constraints are introduced or relaxed:
    * don't allow a pool to transition to write mode if needs_check is set
    * don't allow data or metadata space to be resized if needs_check is set
    * if a thin pool's metadata space is exhausted: the kernel will now
    force the user to take the pool offline for repair before the kernel
    will allow the metadata space to be extended.

    Also, update Documentation to include information about when the thin
    provisioning target commits metadata, how it handles metadata failures
    and running out of space.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Joe Thornber

    Mike Snitzer
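
The constraints listed above amount to a small state machine, sketched below. The struct, the helper names and the use of -EINVAL as the refusal code are assumptions; only PM_WRITE, PM_READ_ONLY and the needs_check flag come from the commit messages in this log.

```c
#include <errno.h>
#include <stdbool.h>

enum pool_mode { PM_WRITE, PM_READ_ONLY };

struct pool {
	enum pool_mode mode;
	bool needs_check;  /* mirrors the superblock flag */
};

/* On metadata failure: go read-only and record that a check is needed. */
void metadata_operation_failed(struct pool *p)
{
	p->mode = PM_READ_ONLY;
	p->needs_check = true;
}

/* A pool may not return to write mode while needs_check is set. */
int set_write_mode(struct pool *p)
{
	if (p->needs_check)
		return -EINVAL;  /* error code chosen for illustration */
	p->mode = PM_WRITE;
	return 0;
}

/* Data and metadata resizes are refused while needs_check is set. */
int resize_space(const struct pool *p)
{
	return p->needs_check ? -EINVAL : 0;
}

/* Models an offline repair (thin_repair) clearing the flag. */
void repair_completed(struct pool *p)
{
	p->needs_check = false;
}
```

After metadata_operation_failed(), both set_write_mode() and resize_space() fail until repair_completed() has run.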
     

28 Feb, 2014

1 commit

  • It was always intended that a user could provide a thin metadata device
    that is larger than the max supported by the on-disk format. The extra
    space would just go unused.

    Unfortunately that never worked. If the user attempted to use a larger
    metadata device on creation they would get an error like the following:

    device-mapper: space map common: space map too large
    device-mapper: transaction manager: couldn't create metadata space map
    device-mapper: thin metadata: tm_create_with_sm failed
    device-mapper: table: 252:17: thin-pool: Error creating metadata object
    device-mapper: ioctl: error adding target to table

    Fix this by allowing the initial metadata space map creation to cap its
    size at the max number of blocks supported (DM_SM_METADATA_MAX_BLOCKS).
    get_metadata_dev_size() must also impose DM_SM_METADATA_MAX_BLOCKS (via
    THIN_METADATA_MAX_SECTORS), otherwise extending metadata would cap at
    THIN_METADATA_MAX_SECTORS_WARNING (which is larger than supported).

    Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
    the sizeof the disk_bitmap_header. So the supported maximum metadata
    size is a bit smaller (reduced from 33423360 to 33292800 sectors).

    Lastly, remove the "excess space will not be used" warning message from
    get_metadata_dev_size(); it resulted in printing the warning multiple
    times. Factor out warn_if_metadata_device_too_big(), call it from
    pool_ctr() and maybe_resize_metadata_dev().

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
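
The two sector counts quoted above can be reproduced with a short calculation. The constants are assumptions chosen because they yield exactly those figures: 4096-byte metadata blocks, 2 bits per block entry (4 entries per byte), 255 index entries and a 16-byte bitmap header.

```c
#include <stdint.h>

enum {
	SECTOR_BYTES = 512,
	METADATA_BLOCK_BYTES = 4096,  /* assumed metadata block size */
	ENTRIES_PER_BYTE = 4,         /* assumes 2 bits per block entry */
	BITMAP_HEADER_BYTES = 16,     /* assumed size of the bitmap header */
	INDEX_ENTRIES = 255           /* assumed entries in the index block */
};

/* Maximum usable metadata size in 512-byte sectors. */
uint64_t max_metadata_sectors(unsigned header_bytes)
{
	/* Block entries that fit in one bitmap block after its header. */
	uint64_t entries = (uint64_t)(METADATA_BLOCK_BYTES - header_bytes)
			   * ENTRIES_PER_BYTE;
	uint64_t max_blocks = INDEX_ENTRIES * entries;

	return max_blocks * (METADATA_BLOCK_BYTES / SECTOR_BYTES);
}
```

Ignoring the header, max_metadata_sectors(0) gives 33423360; accounting for it, max_metadata_sectors(BITMAP_HEADER_BYTES) gives 33292800.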
     

18 Feb, 2014

1 commit

  • Commit 905e51b ("dm thin: commit outstanding data every second")
    introduced a periodic commit. This commit occurs regardless of whether
    any thin devices have made changes.

    Fix the periodic commit to check if any of a pool's thin devices have
    changed using dm_pool_changed_this_transaction().

    Reported-by: Alexander Larsson
    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber
    Cc: stable@vger.kernel.org

    Mike Snitzer
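
A sketch of the fixed behaviour: the periodic commit becomes a no-op unless at least one thin device changed in the current transaction. The per-device flag and the helper names are hypothetical; the commit itself relies on dm_pool_changed_this_transaction().

```c
#include <stdbool.h>

struct thin_dev {
	bool changed;  /* set when the device modifies metadata */
};

struct pool {
	struct thin_dev *devs;
	int nr_devs;
	unsigned commits;  /* number of commits actually performed */
};

/* True if any thin device changed since the last commit. */
bool pool_changed_this_transaction(const struct pool *p)
{
	for (int i = 0; i < p->nr_devs; i++)
		if (p->devs[i].changed)
			return true;
	return false;
}

/* Called by the periodic timer; skips the commit when nothing changed. */
void periodic_commit(struct pool *p)
{
	if (!pool_changed_this_transaction(p))
		return;

	p->commits++;  /* stands in for the real metadata commit */
	for (int i = 0; i < p->nr_devs; i++)
		p->devs[i].changed = false;  /* a new transaction starts clean */
}
```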
     

07 Jan, 2014

2 commits

  • Also, move 'err' member in dm_thin_new_mapping structure to eliminate 4
    byte hole (reduces size from 88 bytes to 80).

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
     
  • If a snapshot is created and later deleted the origin dm_thin_device's
    snapshotted_time will have been updated to reflect the snapshot's
    creation time. The 'shared' flag in the dm_thin_lookup_result struct
    returned from dm_thin_find_block() is an approximation based on
    snapshotted_time -- this is done to avoid O(n), or worse, time
    complexity. In this case, the shared flag would be true.

    But because the 'shared' flag reflects an approximation a block can be
    incorrectly assumed to be shared (e.g. false positive for 'shared'
    because the snapshot no longer exists). This could result in discards
    issued to a thin device not being passed down to the pool's underlying
    data device.

    To fix this we double check that a thin block is really still in-use
    after a mapping is removed using dm_pool_block_is_used(). If the
    reference count for a block is now zero the discard is allowed to be
    passed down.

    Also add a 'definitely_not_shared' member to the dm_thin_new_mapping
    structure -- reflects that the 'shared' flag in the response from
    dm_thin_find_block() can only be held as definitive if false is
    returned.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1043527

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
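
The reasoning above can be sketched as a two-step decision: trust the cheap time-based answer when it says "not shared", and fall back to the reference count when it says "shared". The comparison used for the approximation and all names below are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

struct thin_device {
	uint32_t snapshotted_time;  /* time of the most recent snapshot */
};

/*
 * Cheap approximation: a block mapped before the last snapshot may be
 * shared. It can report true after the snapshot has been deleted.
 */
bool maybe_shared(const struct thin_device *td, uint32_t block_time)
{
	return td->snapshotted_time > block_time;
}

/*
 * A "not shared" answer is definitive. A "shared" answer is double
 * checked against the reference count left after the mapping is removed.
 */
bool can_pass_down_discard(bool approx_shared, uint32_t ref_count_after_unmap)
{
	if (!approx_shared)
		return true;
	return ref_count_after_unmap == 0;
}
```

For a block whose snapshot was later deleted, maybe_shared() still reports true, but the count after unmapping is zero, so the discard is passed down.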
     

11 Dec, 2013

1 commit

  • A thin-pool may be in read-only mode because the pool's data or metadata
    space was exhausted. To allow for recovery, by adding more space to the
    pool, we must allow a pool to transition from PM_READ_ONLY to PM_WRITE
    mode. Otherwise, running out of space will render the pool permanently
    read-only.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     

10 May, 2013

2 commits


27 Jul, 2012

5 commits


03 Jun, 2012

1 commit

  • This patch implements two new messages that can be sent to the thin
    pool target allowing it to take a snapshot of the _metadata_. This
    read-only snapshot can be accessed by userland, concurrently with the
    live target.

    Only one metadata snapshot can be held at a time. The pool's status
    line will give the block location for the current msnap.

    Since version 0.1.5 of the userland thin provisioning tools, the
    thin_dump program displays the msnap as follows:

    thin_dump -m

    Available here: https://github.com/jthornber/thin-provisioning-tools

    Now that userland can access the metadata we can do various things
    that have traditionally been kernel side tasks:

    i) Incremental backups.

    By using metadata snapshots we can work out what blocks have
    changed over time. Combined with data snapshots we can ensure
    the data doesn't change while we back it up.

    A short proof of concept script can be found here:

    https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb

    ii) Migration of thin devices from one pool to another.

    iii) Merging snapshots back into an external origin.

    iv) Asynchronous replication.

    Signed-off-by: Joe Thornber
    Signed-off-by: Alasdair G Kergon

    Joe Thornber
     

29 Mar, 2012

1 commit

  • The thin metadata format can only make use of a device that is
    <= 15.9375 GB. Other factors may still force a larger allocation
    (e.g. large, >= 1 GB, physical extents).

    Rather than reject a larger metadata device, during thin-pool device
    construction, switch to allowing it but issue a warning if a device
    larger than THIN_METADATA_MAX_SECTORS_WARNING (16 GB) is
    provided. Any space over 15.9375 GB will not be used.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     

01 Nov, 2011

1 commit

  • Initial EXPERIMENTAL implementation of device-mapper thin provisioning
    with snapshot support. The 'thin' target is used to create instances of
    the virtual devices that are hosted in the 'thin-pool' target. The
    thin-pool target provides data sharing among devices. This sharing is
    made possible using the persistent-data library in the previous patch.

    The main highlight of this implementation, compared to the previous
    implementation of snapshots, is that it allows many virtual devices to
    be stored on the same data volume, simplifying administration and
    allowing sharing of data between volumes (thus reducing disk usage).

    Another big feature is support for arbitrary depth of recursive
    snapshots (snapshots of snapshots of snapshots ...). The previous
    implementation of snapshots did this by chaining together lookup tables,
    and so performance was O(depth). This new implementation uses a single
    data structure so we don't get this degradation with depth.

    For further information and examples of how to use this, please read
    Documentation/device-mapper/thin-provisioning.txt

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Joe Thornber