18 Oct, 2016

1 commit

  • dm-raid 1.9.0 fails to activate existing RAID4/10 devices that have the
    old superblock format (which does not have takeover/reshaping support
    that was added via commit 33e53f06850f).

    Fix validation path for old superblocks by reverting to the old raid4
    layout and basing checks on mddev->new_{level,layout,...} members in
    super_init_validation().

    Cc: stable@vger.kernel.org # 4.8
    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Mike Snitzer

    Heinz Mauelshagen
     

08 Aug, 2016

1 commit

  • Since commit 63a4cc24867d, bio->bi_rw contains flags in the lower
    portion and the op code in the higher portions. This means that
    old code that relies on manually setting bi_rw is most likely
    going to be broken. Instead of letting that brokenness linger,
    rename the member, to force old and out-of-tree code to break
    at compile time instead of at runtime.

    No intended functional changes in this commit.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

08 Jun, 2016

1 commit

  • To avoid confusion between REQ_OP_FLUSH, which is handled by
    request_fn drivers, and upper layers requesting the block layer
    perform a flush sequence along with possibly a WRITE, this patch
    renames REQ_FLUSH to REQ_PREFLUSH.

    Signed-off-by: Mike Christie
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Mike Christie
     

11 Mar, 2016

1 commit

  • smq seems to be performing better than the old mq policy in all
    situations, as well as using a quarter of the memory.

    Make 'mq' an alias for 'smq' when choosing a cache policy. The tunables
    that were present for the old mq are faked, and have no effect. mq
    should be considered deprecated now.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
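
    As a rough sketch (device names and sizes here are made up), a cache
    table line that still names "mq" now gets smq behaviour underneath:

      dmsetup create cached --table \
        "0 41943040 cache /dev/vg/meta /dev/vg/fast /dev/vg/origin 512 1 writethrough mq 0"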
     

10 Dec, 2015

2 commits

  • If ignore_zero_blocks is enabled, dm-verity will return zeroes for
    blocks matching a zero hash without validating the content.

    Signed-off-by: Sami Tolvanen
    Signed-off-by: Mike Snitzer

    Sami Tolvanen
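
    For reference, ignore_zero_blocks is requested as an optional argument
    on the verity table line (the digest, salt, devices and sizes below are
    placeholders):

      dmsetup create vroot --readonly --table \
        "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 <root_digest> <salt> 1 ignore_zero_blocks"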
     
  • Add support for correcting corrupted blocks using Reed-Solomon.

    This code uses RS(255, N) interleaved across data and hash
    blocks. Each error-correcting block covers N bytes evenly
    distributed across the combined total data, so that each byte is a
    maximum distance away from the others. This makes it possible to
    recover from several consecutive corrupted blocks with relatively
    small space overhead.

    In addition, using verity hashes to locate erasures nearly doubles
    the effectiveness of error correction. Being able to detect
    corrupted blocks also improves performance, because only corrupted
    blocks need to be corrected.

    For a 2 GiB partition, RS(255, 253) (two parity bytes for each
    253-byte block) can correct up to 16 MiB of consecutive corrupted
    blocks if erasures can be located, and 8 MiB if they cannot, with
    16 MiB space overhead.

    Signed-off-by: Sami Tolvanen
    Signed-off-by: Mike Snitzer

    Sami Tolvanen
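
    A hedged sketch of wiring up the correction data via the verity target's
    optional arguments (devices, counts and digests are placeholders):

      dmsetup create vroot --readonly --table \
        "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 <root_digest> <salt> \
         8 use_fec_from_device /dev/sda3 fec_roots 2 fec_blocks 264144 fec_start 0"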
     

05 Nov, 2015

1 commit

  • Pull device mapper updates from Mike Snitzer:
    "Smaller set of DM changes for this merge. I've based these changes on
    Jens' for-4.4/reservations branch because the associated DM changes
    required it.

    - Revert a dm-multipath change that caused a regression for
    unprivileged users (e.g. kvm guests) that issued ioctls when a
    multipath device had no available paths.

    - Include Christoph's refactoring of DM's ioctl handling and add
    support for passing through persistent reservations with DM
    multipath.

    - All other changes are very simple cleanups"

    * tag 'dm-4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm switch: simplify conditional in alloc_region_table()
    dm delay: document that offsets are specified in sectors
    dm delay: capitalize the start of an delay_ctr() error message
    dm delay: Use DM_MAPIO macros instead of open-coded equivalents
    dm linear: remove redundant target name from error messages
    dm persistent data: eliminate unnecessary return values
    dm: eliminate unused "bioset" process for each bio-based DM device
    dm: convert ffs to __ffs
    dm: drop NULL test before kmem_cache_destroy() and mempool_destroy()
    dm: add support for passing through persistent reservations
    dm: refactor ioctl handling
    Revert "dm mpath: fix stalls when handling invalid ioctls"
    dm: initialize non-blk-mq queue data before queue is used

    Linus Torvalds
     

10 Oct, 2015

1 commit

  • Commit 76c44f6d80 introduced the possibility for "Overflow" to be reported
    by the snapshot device's status. Older userspace (e.g. lvm2) does not
    handle the "Overflow" status response.

    Fix this incompatibility by requiring that newer userspace code, which
    can cope with "Overflow", request the persistent store with overflow
    support by using "PO" (Persistent with Overflow) for the snapshot store
    type.

    Reported-by: Zdenek Kabelac
    Fixes: 76c44f6d80 ("dm snapshot: don't invalidate on-disk image on snapshot write overflow")
    Reviewed-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mike Snitzer
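
    For example, newer userspace that understands "Overflow" would request
    the overflow-capable store like this (devices and chunk size are
    illustrative):

      dmsetup create snap --table \
        "0 41943040 snapshot /dev/vg/origin /dev/vg/cow PO 32"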
     

19 Aug, 2015

1 commit

  • If the user selected the precise_timestamps or histogram options, report
    them in the @stats_list message output.

    If the user didn't select these options, no extra tokens are reported,
    thus it is backward compatible with old software that doesn't know about
    precise timestamps and histogram.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org # 4.2

    Mikulas Patocka
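
    A minimal sketch of inspecting a region (the device name is
    hypothetical; the precise_timestamps and histogram tokens are only
    appended for regions created with those options):

      dmsetup message vg-lv 0 @stats_list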
     

18 Jun, 2015

3 commits

  • Add an option to dm statistics to collect and report a histogram of
    IO latencies.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     
  • Make it possible to use precise timestamps with nanosecond granularity
    in dm statistics.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
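
    Together with the histogram option from the previous entry, a stats
    region using both features could be created roughly like this (device,
    range, step and bin boundaries are made-up values):

      dmsetup message vg-lv 0 @stats_create - /100 2 precise_timestamps histogram:1000000,10000000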
     
  • The Stochastic multiqueue (SMQ) policy (vs MQ) offers the promise of
    less memory utilization, improved performance and increased adaptability
    in the face of changing workloads. SMQ also does not have any
    cumbersome tuning knobs.

    Users may switch from "mq" to "smq" simply by appropriately reloading a
    DM table that is using the cache target. Doing so will cause all of the
    mq policy's hints to be dropped. Also, performance of the cache may
    degrade slightly until smq recalculates the origin device's hotspots
    that should be cached.

    In the future the "mq" policy will just silently make use of "smq" and
    the mq code will be removed.

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
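
    A hedged sketch of such a reload (all table fields other than the policy
    name are carried over from the existing mapping and are placeholders
    here):

      dmsetup suspend cached
      dmsetup reload cached --table \
        "0 41943040 cache /dev/vg/meta /dev/vg/fast /dev/vg/origin 512 1 writethrough smq 0"
      dmsetup resume cached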
     

12 Jun, 2015

1 commit

  • If a cache metadata operation fails (e.g. transaction commit) the
    cache's metadata device will abort the current transaction, set a new
    needs_check flag, and the cache will transition to "read-only" mode. If
    aborting the transaction or setting the needs_check flag fails the cache
    will transition to "fail-io" mode.

    Once needs_check is set the cache device will not be allowed to
    activate. Activation requires write access to metadata. Future work is
    needed to add proper support for running the cache in read-only mode.

    Once in fail-io mode the cache will report a status of "Fail".

    Also, add a commit() wrapper that will disallow commits if in read_only
    or fail mode.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     

30 May, 2015

2 commits

  • Add dm-raid access to the MD RAID0 personality to enable single zone
    striping.

    The following changes enable that access:
    - add type definition to raid_types array
    - make bitmap creation conditional in super_validate(), because
    bitmaps are not allowed in raid0
    - set rdev->sectors to the data image size in super_validate()
    to allow the raid0 personality to calculate the MD array
    size properly
    - use mddev_(un)lock() functions instead of direct mutex_(un)lock()
    (wrapped in here because it's a trivial change)
    - enhance raid_status() to always report full sync for raid0
    so that userspace checks for 100% sync will succeed and allow
    for resize (and takeover/reshape once added in future patches)
    - enhance raid_resume() to not load bitmap in case of raid0
    - add merge function to avoid data corruption (seen with readahead)
    that resulted from bio payloads that grew too large. This problem
    did not occur with the other raid levels because it either did not
    apply without striping (raid1) or was avoided via stripe caching.
    - raise version to 1.7.0 because of the raid0 API change

    Signed-off-by: Heinz Mauelshagen
    Reviewed-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer

    Heinz Mauelshagen
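
    With that in place, a single-zone striped mapping can be expressed
    roughly as follows (sector count, chunk size and devices are
    illustrative; the "-" slots stand for the metadata devices, unused here):

      dmsetup create r0 --table \
        "0 8388608 raid raid0 1 2048 2 - /dev/sdb1 - /dev/sdc1"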
     
  • Remove comment above parse_raid_params() that claims
    "devices_handle_discard_safely" is a table line argument when it is
    actually is a module parameter.

    Also, backfill dm-raid target version 1.6.0 documentation.

    Signed-off-by: Heinz Mauelshagen
    Reviewed-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer

    Heinz Mauelshagen
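
    For context, this is set when loading the module or later via sysfs,
    roughly as below (assuming the module shows up as dm_raid in sysfs):

      modprobe dm-raid devices_handle_discard_safely=Y
      echo Y > /sys/module/dm_raid/parameters/devices_handle_discard_safely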
     

16 Apr, 2015

4 commits

  • Cryptsetup home page moved to GitLab.
    Also remove the link to the abandoned Truecrypt page.

    Signed-off-by: Milan Broz
    Signed-off-by: Mike Snitzer

    Milan Broz
     
  • Introduce a new target that is meant for file system developers to test file
    system integrity at particular points in the life of a file system. We capture
    all write requests and associated data and log them to a separate device
    for later replay. There is a userspace utility to do this replay. The
    idea behind this is to give file system developers a tool to verify that
    the file system is always consistent.

    Signed-off-by: Josef Bacik
    Reviewed-by: Zach Brown
    Signed-off-by: Mike Snitzer

    Josef Bacik
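
    A minimal sketch of setting the target up (device names are
    placeholders; the second device receives the write log for later
    replay):

      dmsetup create logdev --table \
        "0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"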
     
  • Add device specific modes to dm-verity to specify how corrupted
    blocks should be handled. The following modes are defined:

    - DM_VERITY_MODE_EIO is the default behavior, where reading a
    corrupted block results in -EIO.

    - DM_VERITY_MODE_LOGGING only logs corrupted blocks, but does
    not block the read.

    - DM_VERITY_MODE_RESTART calls kernel_restart when a corrupted
    block is discovered.

    In addition, each mode sends a uevent to notify userspace of
    corruption and to allow further recovery actions.

    The driver defaults to previous behavior (DM_VERITY_MODE_EIO)
    and other modes can be enabled with an additional parameter to
    the verity table.

    Signed-off-by: Sami Tolvanen
    Signed-off-by: Mike Snitzer

    Sami Tolvanen
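
    In table terms the logging and restart modes map to the optional
    arguments ignore_corruption and restart_on_corruption, for example
    (digest, salt, devices and sizes are placeholders):

      dmsetup create vroot --readonly --table \
        "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 <root_digest> <salt> 1 restart_on_corruption"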
     
  • The 'trim' message wasn't ever implemented.

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

17 Feb, 2015

2 commits

  • Make it possible to disable offloading writes by setting the optional
    'submit_from_crypt_cpus' table argument.

    There are some situations where offloading write bios from the
    encryption threads to a single thread degrades performance
    significantly.

    The default is to offload write bios to the same thread because it
    benefits CFQ to have writes submitted using the same IO context.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     
  • Use unbound workqueue by default so that work is automatically balanced
    between available CPUs. The original behavior of encrypting using the
    same cpu that IO was submitted on can still be enabled by setting the
    optional 'same_cpu_crypt' table argument.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
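
    Both this option and submit_from_crypt_cpus from the entry above are
    requested as optional arguments on the crypt table line; a rough example
    with a placeholder key, device and size:

      dmsetup create secure --table \
        "0 2097152 crypt aes-xts-plain64 <hex_key> 0 /dev/sdb 0 2 same_cpu_crypt submit_from_crypt_cpus"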
     

11 Nov, 2014

2 commits

  • Before, if the user wanted sequential IO to be promoted to the cache
    they'd have to set sequential_threshold to some nebulous large value.

    Now, the user may easily disable sequential IO detection (and sequential
    IO's implicit bypass of the cache) by setting sequential_threshold to 0.

    Signed-off-by: Mike Snitzer

    Mike Snitzer
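
    Disabling detection then amounts to passing the tunable as a policy
    argument (other table fields are placeholders):

      dmsetup reload cached --table \
        "0 41943040 cache /dev/vg/meta /dev/vg/fast /dev/vg/origin 512 1 writethrough mq 2 sequential_threshold 0"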
     
  • Rather than maintaining a separate promote_threshold variable that we
    periodically update we now use the hit count of the oldest clean
    block. Also add a fudge factor to discourage demoting dirty blocks.

    In some tests this makes a sizeable difference, because the old code
    was too eager to demote blocks. For example, device-mapper-test-suite's
    git_extract_cache_quick test goes from taking 190 seconds, to 142
    (linear on spindle takes 250).

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
     

02 Aug, 2014

1 commit

  • Add support for quickly loading a repetitive pattern into the
    dm-switch target.

    In the "set_regions_mappings" message, the user may now use "Rn,m" as
    one of the arguments. "n" and "m" are hexadecimal numbers. The "Rn,m"
    argument repeats the last "n" arguments in the following "m" slots.

    For example:
    dmsetup message switch 0 set_region_mappings 1000:1 :2 R2,10
    is equivalent to
    dmsetup message switch 0 set_region_mappings 1000:1 :2 :1 :2 :1 :2 :1 :2 \
    :1 :2 :1 :2 :1 :2 :1 :2 :1 :2

    Requested-by: Jay Wang
    Tested-by: Jay Wang
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     

21 May, 2014

1 commit

  • Commit 85ad643b ("dm thin: add timeout to stop out-of-data-space mode
    holding IO forever") introduced a fixed 60 second timeout. Users may
    want to either disable or modify this timeout.

    Allow the out-of-data-space timeout to be configured using the
    'no_space_timeout' dm-thin-pool module param. Setting it to 0 will
    disable the timeout, resulting in IO being queued until more data space
    is added to the thin-pool.

    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org # 3.14+

    Mike Snitzer
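
    As a sketch, the timeout is given in seconds and 0 disables it (assuming
    the module is exposed as dm_thin_pool in sysfs):

      modprobe dm-thin-pool no_space_timeout=0
      echo 0 > /sys/module/dm_thin_pool/parameters/no_space_timeout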
     

28 Mar, 2014

1 commit

  • dm-era is a target that behaves similarly to the linear target. In
    addition it keeps track of which blocks were written within a user
    defined period of time called an 'era'. Each era target instance
    maintains the current era as a monotonically increasing 32-bit
    counter.

    Use cases include tracking changed blocks for backup software, and
    partially invalidating the contents of a cache to restore cache
    coherency after rolling back a vendor snapshot.

    dm-era is primarily expected to be paired with the dm-cache target.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
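
    The target takes a metadata device, an origin device and a block size,
    roughly as below (devices, sizes and granularity are illustrative):

      dmsetup create era0 --table \
        "0 41943040 era /dev/vg/meta /dev/vg/origin 8192"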
     

06 Mar, 2014

1 commit

  • If a thin metadata operation fails, the current transaction will abort,
    potentially causing data loss in IO layers up the stack (e.g.
    filesystems). As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the
    thin metadata's superblock which:
    1) requires the user verify the thin metadata is consistent (e.g. use
    thin_check, etc)
    2) suggests the user verify the thin data is consistent (e.g. use fsck)

    The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
    to run thin_repair.

    On metadata operation failure: abort current metadata transaction, set
    pool in read-only mode, and now set the needs_check flag.

    As part of this change, constraints are introduced or relaxed:
    * don't allow a pool to transition to write mode if needs_check is set
    * don't allow data or metadata space to be resized if needs_check is set
    * if a thin pool's metadata space is exhausted: the kernel will now
    force the user to take the pool offline for repair before the kernel
    will allow the metadata space to be extended.

    Also, update Documentation to include information about when the thin
    provisioning target commits metadata, how it handles metadata failures
    and running out of space.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Joe Thornber

    Mike Snitzer
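
    The userspace checks referred to above come from thin-provisioning-tools;
    a hedged outline of the flow (device paths are placeholders):

      thin_check /dev/vg/tmeta
      thin_repair -i /dev/vg/tmeta -o /dev/vg/tmeta_repaired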
     

17 Jan, 2014

1 commit

  • The cache's policy may have been established using the "default" alias,
    which is currently the "mq" policy but the default policy may change in
    the future. It is useful to know exactly which policy is being used.

    Add a 'real' member to the dm_cache_policy_type structure and have the
    "default" dm_cache_policy_type point to the real "mq"
    dm_cache_policy_type. Update dm_cache_policy_get_name() to check if
    real is set, if so report the name of the real policy (not the alias).

    Requested-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

10 Jan, 2014

1 commit

  • Improve cache_status to emit:
    <metadata block size> <#used metadata blocks>/<#total metadata blocks>
    <cache block size> <#used cache blocks>/<#total cache blocks>
    ...

    Adding the block sizes allows for easier calculation of the overall size
    of both the metadata and cache devices. Adding <#total cache blocks>
    provides useful context for how much of the cache is used.

    Unfortunately these additions to the status will require updates to
    users' scripts that monitor the cache status. But these changes help
    provide more comprehensive information about the cache device and will
    simplify tools that are being developed to manage dm-cache devices --
    because they won't need to issue 3 operations to cobble together the
    information that we can easily provide via a single status ioctl.

    While updating the status documentation in cache.txt spaces were
    tabify'd.

    Requested-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
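
    A hedged illustration of the new leading status fields (all numbers are
    made up):

      # dmsetup status cached
      0 41943040 cache 8 1024/8192 512 523/16384 ...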
     

07 Jan, 2014

2 commits

  • Internally the mq policy maintains a promotion threshold variable. If
    the hit count of a block not in the cache goes above this threshold it
    gets promoted to the cache.

    This patch introduces three new tunables that allow you to tweak the
    promotion threshold by adding a small value. These adjustments depend
    on the io type:

    read_promote_adjustment: READ io, default 4
    write_promote_adjustment: WRITE io, default 8
    discard_promote_adjustment: READ/WRITE io to a discarded block, default 1

    If you're trying to quickly warm a new cache device you may wish to
    reduce these to encourage promotion. Remember to switch them back to
    their defaults after the cache fills though.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer

    Joe Thornber
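
    These are passed as mq policy arguments in the cache table, e.g. to warm
    a new cache more aggressively (other fields are placeholders):

      dmsetup reload cached --table \
        "0 41943040 cache /dev/vg/meta /dev/vg/fast /dev/vg/origin 512 1 writethrough mq 4 read_promote_adjustment 1 write_promote_adjustment 2"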
     
  • If the pool runs out of data or metadata space, the pool can either
    queue or error the IO destined to the data device. The default is to
    queue the IO until more space is added.

    An admin may now configure the pool to error IO when no space is
    available by setting the 'error_if_no_space' feature when loading the
    thin-pool table.

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
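
    The feature is requested on the thin-pool table line, roughly (devices,
    data block size and low water mark are placeholders):

      dmsetup create pool --table \
        "0 41943040 thin-pool /dev/vg/tmeta /dev/vg/tdata 128 32768 1 error_if_no_space"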