28 Mar, 2014

2 commits

  • Introduce dm_table_run_md_queue_async() to run the request_queue of the
    mapped_device associated with a request-based DM table.

    Also add dm_md_get_queue() wrapper to extract the request_queue from a
    mapped_device.
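
    A minimal sketch of the two helpers in use; the calling context is
    hypothetical and dm_md_get_queue()'s exact signature is assumed from
    the description above:

    struct request_queue *q = dm_md_get_queue(md); /* queue of the mapped_device */

    dm_table_run_md_queue_async(t);                /* kick that queue asynchronously */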

    Signed-off-by: Mike Snitzer
    Signed-off-by: Hannes Reinecke
    Reviewed-by: Jun'ichi Nomura

    Mike Snitzer
     
  • Remove dm_get_mapinfo() because no target uses it. Targets can allocate
    per-bio data using ti->per_bio_data_size, which is much more flexible
    than union map_info.

    Leave union map_info only for the request-based multipath target's use.
    Also delete the unused "unsigned long long ll" field of union map_info.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Mikulas Patocka
     

20 Sep, 2013

1 commit

  • Work around the SCSI layer's problematic WRITE SAME heuristics by
    disabling WRITE SAME in the DM multipath device's queue_limits if an
    underlying device disabled it.

    The WRITE SAME heuristics, in both the original commit 5db44863b6eb
    ("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
    66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
    WRITE SAME(10) even without successfully determining that it is supported.
    After the first failed WRITE SAME, the SCSI layer disables WRITE SAME
    for the device (by setting sdkp->device->no_write_same, which causes
    'max_write_same_sectors' in the device's queue_limits to be set to 0).

    When a device is stacked on top of such a SCSI device, any changes to the
    SCSI device's queue_limits do not automatically propagate up the stack.
    As such, a DM multipath device will not have its WRITE SAME support
    disabled. This causes the block layer to continue to issue WRITE SAME
    requests to the mpath device, which causes paths to fail and (if mpath IO
    isn't configured to queue when no paths are available) results in
    actual IO errors to the upper layers.

    This fix doesn't help configurations that have additional devices
    stacked on top of the mpath device (e.g. linear DM devices that LVM
    creates on top). A proper fix that restacks all the queue_limits from the
    bottom of the device stack up will need to be explored if SCSI continues
    to use this model of optimistically allowing op codes and then disabling
    them after they fail for the first time.
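
    A minimal sketch of the fix described above (helper name hypothetical):
    on the first WRITE SAME failure, zero out the mpath device's own limit
    so the block layer stops issuing WRITE SAME requests to it:

    static void sketch_disable_write_same(struct request_queue *q)
    {
            /* 0 tells the block layer that WRITE SAME is unsupported */
            q->limits.max_write_same_sectors = 0;
    }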

    Before this patch:

    EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
    device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
    device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
    end_request: critical target error, dev dm-6, sector 528
    dm-6: WRITE SAME failed. Manually zeroing.
    device-mapper: multipath: Failing path 8:112.
    end_request: I/O error, dev dm-6, sector 4616
    dm-6: WRITE SAME failed. Manually zeroing.
    end_request: I/O error, dev dm-6, sector 4616
    end_request: I/O error, dev dm-6, sector 5640
    end_request: I/O error, dev dm-6, sector 6664
    end_request: I/O error, dev dm-6, sector 7688
    end_request: I/O error, dev dm-6, sector 524288
    Buffer I/O error on device dm-6, logical block 65536
    lost page write due to I/O error on dm-6
    JBD2: Error -5 detected when updating journal superblock for dm-6-8.
    end_request: I/O error, dev dm-6, sector 524296
    Aborting journal on device dm-6-8.
    end_request: I/O error, dev dm-6, sector 524288
    Buffer I/O error on device dm-6, logical block 65536
    lost page write due to I/O error on dm-6
    JBD2: Error -5 detected when updating journal superblock for dm-6-8.

    # cat /sys/block/sdh/queue/write_same_max_bytes
    0
    # cat /sys/block/dm-6/queue/write_same_max_bytes
    33553920

    After this patch:

    EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
    device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
    device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
    end_request: critical target error, dev dm-6, sector 528
    dm-6: WRITE SAME failed. Manually zeroing.

    # cat /sys/block/sdh/queue/write_same_max_bytes
    0
    # cat /sys/block/dm-6/queue/write_same_max_bytes
    0

    It should be noted that WRITE SAME support wasn't enabled in DM
    multipath until v3.10.

    Signed-off-by: Mike Snitzer
    Cc: Martin K. Petersen
    Cc: Hannes Reinecke
    Cc: stable@vger.kernel.org # 3.10+

    Mike Snitzer
     

06 Sep, 2013

1 commit

  • Support the collection of I/O statistics on user-defined regions of
    a DM device. If no regions are defined, no statistics are collected, so
    there is no performance impact. Only bio-based DM devices are
    currently supported.

    Each user-defined region specifies a starting sector, length and step.
    Individual statistics will be collected for each step-sized area within
    the range specified.

    The I/O statistics counters for each step-sized area of a region are
    in the same format as /sys/block/*/stat or /proc/diskstats but extra
    counters (12 and 13) are provided: total time spent reading and
    writing in milliseconds. All these counters may be accessed by sending
    the @stats_print message to the appropriate DM device via dmsetup.
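
    For illustration, a hypothetical session (device name invented; message
    syntax per Documentation/device-mapper/statistics.txt): subdivide the
    whole device into 100 areas, then print the counters for region 0:

    # dmsetup message statsdev 0 @stats_create - /100
    0
    # dmsetup message statsdev 0 @stats_print 0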

    The creation of DM statistics will allocate memory via kmalloc or
    fall back to using vmalloc space. At most, 1/4 of the overall system
    memory may be allocated by DM statistics. The admin can see how much
    memory is used by reading
    /sys/module/dm_mod/parameters/stats_current_allocated_bytes

    See Documentation/device-mapper/statistics.txt for more details.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

11 Jul, 2013

1 commit

  • This patch removes "io_lock" and "map_lock" in struct mapped_device and
    "holders" in struct dm_table and replaces these mechanisms with
    sleepable RCU (SRCU).

    Previously, the code called "dm_get_live_table" and "dm_table_put" to
    get and release the table. Now, the code calls "dm_get_live_table"
    and "dm_put_live_table". dm_get_live_table takes the SRCU read lock and
    dm_put_live_table releases it.

    dm_get_live_table_fast/dm_put_live_table_fast can be used instead of
    dm_get_live_table/dm_put_live_table. These *_fast functions use
    non-sleepable RCU, so the caller must not block between them.

    If the code changes the active or inactive dm table, it must call
    dm_sync_table before destroying the old table.
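
    An illustrative caller (accessor signatures as introduced by this patch;
    the surrounding context is hypothetical):

    int srcu_idx;
    struct dm_table *map = dm_get_live_table(md, &srcu_idx);

    if (map) {
            /* ... use the table; sleeping is allowed under SRCU ... */
    }
    dm_put_live_table(md, srcu_idx);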

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

02 Mar, 2013

3 commits

  • Add a num_write_bios function to struct target.

    If an instance of a target sets this, it will be queried before the
    target's mapping function is called on a write bio, and the response
    controls the number of copies of the write bio that the target will
    receive.

    This provides a convenient way for a target to send the same data to
    more than one device. The new cache target uses this in writethrough
    mode, to send the data both to the cache and the backing device.
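
    A hypothetical writethrough-style target asking for two copies of every
    write bio (hook signature per this commit):

    static unsigned my_num_write_bios(struct dm_target *ti, struct bio *bio)
    {
            return 2;       /* one copy for the cache, one for the origin */
    }

    /* in the target's constructor: */
    ti->num_write_bios = my_num_write_bios;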

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Use 'bio' in the name of variables and functions that deal with
    bios rather than 'request' to avoid confusion with the normal
    block layer use of 'request'.

    No functional changes.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Avoid returning a truncated table or status string instead of setting
    the DM_BUFFER_FULL_FLAG when the last target of a table fills the
    buffer.

    When processing a table or status request, the function retrieve_status
    calls ti->type->status. If ti->type->status returns non-zero,
    retrieve_status assumes that the buffer overflowed and sets
    DM_BUFFER_FULL_FLAG.

    However, targets don't return non-zero values from their status method
    on overflow. Most targets always return zero.

    If a buffer overflow happens in a target that is not the last in the
    table, it gets noticed during the next iteration of the loop in
    retrieve_status; but if a buffer overflow happens in the last target, it
    goes unnoticed and erroneously truncated data is returned.

    In the current code, the targets behave in the following way:
    * dm-crypt returns -ENOMEM if there is not enough space to store the
    key, but it returns 0 on all other overflows.
    * dm-thin returns errors from the status method if a disk error happened.
    This is incorrect because retrieve_status doesn't check the error
    code; it assumes that all non-zero values mean buffer overflow.
    * all the other targets always return 0.

    This patch changes the ti->type->status function to return void (because
    most targets don't use the return code). Overflow is detected in
    retrieve_status: if the status method fills up the remaining space
    completely, it is assumed that buffer overflow happened.
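
    A simplified sketch of the new rule in retrieve_status (the real code
    tracks offsets across multiple targets; variable names here are
    illustrative):

    ti->type->status(ti, type, status_flags, outptr, remaining);
    if (strlen(outptr) + 1 >= remaining) {
            /* status filled the space completely: assume truncation */
            param->flags |= DM_BUFFER_FULL_FLAG;
            break;
    }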

    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

22 Dec, 2012

4 commits

  • This patch removes map_info from bio-based device mapper targets.
    map_info is still used for request-based targets.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch moves target_request_nr from map_info to dm_target_io and
    makes it accessible with dm_bio_get_target_request_nr.

    This patch is a preparation for the next patch that removes map_info.
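
    A short sketch of the new accessor in use, e.g. in a target's map
    function:

    unsigned request_nr = dm_bio_get_target_request_nr(bio);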

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Introduce a field per_bio_data_size in struct dm_target.

    Targets can set this field in the constructor. If a target sets this
    field to a non-zero value, "per_bio_data_size" bytes of auxiliary data
    are allocated for each bio submitted to the target. This data can be
    used for any purpose by the target and helps improve performance by
    removing some per-target mempools.

    Per-bio data is accessed with dm_per_bio_data. The
    argument data_size must be the same as the value per_bio_data_size in
    dm_target.

    If the target has a pointer to the per-bio data, it can get a pointer to
    the bio with the dm_bio_from_per_bio_data() function (data_size must be
    the same as the value passed to dm_per_bio_data).
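
    A hypothetical target illustrating the facility (the context struct is
    invented; the API names are from this commit):

    struct per_bio_ctx {
            unsigned long start_jiffies;
    };

    /* constructor: reserve the space */
    ti->per_bio_data_size = sizeof(struct per_bio_ctx);

    /* map function: use it */
    struct per_bio_ctx *ctx = dm_per_bio_data(bio, sizeof(struct per_bio_ctx));
    ctx->start_jiffies = jiffies;

    /* and back again, given only the per-bio data */
    struct bio *orig = dm_bio_from_per_bio_data(ctx, sizeof(struct per_bio_ctx));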

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Allow targets to opt in to WRITE SAME support by setting
    'num_write_same_requests' in the dm_target structure.

    A dm device will only advertise WRITE SAME support if all its
    targets and all its underlying devices support it.
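
    For illustration, a target opts in from its constructor (field name per
    this commit):

    ti->num_write_same_requests = 1;        /* advertise WRITE SAME support */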

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     

27 Jul, 2012

6 commits

  • Commit outstanding metadata before returning the status for a dm thin
    pool so that the numbers reported are as up-to-date as possible.

    The commit is not performed if the device is suspended or if
    the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target
    through a new 'status_flags' parameter in the target's dm_status_fn.

    The userspace dmsetup tool will support the --noflush flag with the
    'dmsetup status' and 'dmsetup wait' commands from version 1.02.76
    onwards.
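
    A sketch of how a status method can honor the new parameter (the flag
    name DM_STATUS_NOFLUSH_FLAG follows this interface change; the commit
    helper is hypothetical):

    if (!(status_flags & DM_STATUS_NOFLUSH_FLAG) && !dm_suspended(ti))
            (void) commit_outstanding_metadata(pool);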

    Tested-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Use boolean bit fields for flags in struct dm_target.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Allow targets to override the 'supports flush' calculation.

    Set 'flush_supported' if a target needs to receive flushes regardless of
    whether or not its underlying devices have support.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Joe Thornber
     
  • This patch introduces a new variable split_discard_requests. It can be
    set by targets so that discard requests are split on max_io_len
    boundaries.

    When split_discard_requests is not set, discard requests are only split on
    boundaries between targets, as was the case before this patch.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Remove the restriction that limits a target's specified maximum incoming
    I/O size to be a power of 2.

    Rename this setting from 'split_io' to the less-ambiguous 'max_io_len'.
    Change it from sector_t to uint32_t, which is plenty big enough, and
    introduce a wrapper function dm_set_target_max_io_len() to set it.
    Use sector_div() to process it now that it is not necessarily a power of 2.
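
    An illustrative constructor call (chunk_size is a hypothetical target
    parameter; the length no longer needs to be a power of 2):

    int r = dm_set_target_max_io_len(ti, chunk_size);
    if (r)
            return r;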

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Remove unused dm_flush_fn .flush target method from header.

    This was left over from the FLUSH/FUA conversion and is no longer used.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Joe Thornber
     

01 Nov, 2011

4 commits

  • Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed
    with any other target type, and once loaded into a device, it cannot be
    replaced with a table containing a different type.

    The thin provisioning pool device will use this.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Add a target feature flag DM_TARGET_ALWAYS_WRITEABLE to indicate that a target
    does not support read-only mode.

    The initial implementation of the thin provisioning target uses this.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Introduce the concept of a singleton table which contains exactly one target.

    If a target type sets the DM_TARGET_SINGLETON feature bit, device-mapper
    will ensure that any table that includes that target contains no others.

    The thin provisioning pool target uses this.
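
    Taken together with the two feature bits introduced above, a sketch of
    a target type declaring all three, modeled on the thin pool target:

    static struct target_type pool_target = {
            .name     = "pool",
            .features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE |
                        DM_TARGET_IMMUTABLE,
            /* ... .ctr, .dtr, .map, .status, etc. ... */
    };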

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • printk_ratelimit() shares global ratelimiting state with all
    other subsystems, so its usage is discouraged. Instead, define
    and use DM's own local ratelimit state.
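
    A sketch of the local state replacing printk_ratelimit() (names are
    illustrative; the macros are from linux/ratelimit.h):

    static DEFINE_RATELIMIT_STATE(dm_ratelimit_state,
                                  DEFAULT_RATELIMIT_INTERVAL,
                                  DEFAULT_RATELIMIT_BURST);

    if (__ratelimit(&dm_ratelimit_state))
            DMWARN("something noteworthy happened");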

    Signed-off-by: Namhyung Kim
    Signed-off-by: Alasdair G Kergon

    Namhyung Kim
     

26 Sep, 2011

1 commit

  • If optional discard support in dm-crypt is enabled, discard requests
    bypass the crypt queue and blocks of the underlying device are discarded.
    For the read path, discarded blocks are handled the same as normal
    ciphertext blocks, and are thus decrypted.

    So if the underlying device announces that discarded regions return
    zeroes, dm-crypt must disable this flag, because after decryption there
    would be just random noise instead of zeroes.
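
    A sketch of the disabling described above; the flag name is an
    assumption based on the struct dm_target of that era:

    /* in the dm-crypt constructor, when discard passdown is enabled: */
    ti->discard_zeroes_data_unsupported = 1;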

    Signed-off-by: Milan Broz
    Signed-off-by: Alasdair G Kergon

    Milan Broz
     

10 Mar, 2011

1 commit

  • Code has been converted over to the new explicit on-stack plugging,
    and delay users have been converted to use the new API for that.
    So let's kill off the old plugging along with aops->sync_page().

    Signed-off-by: Jens Axboe

    Jens Axboe
     

14 Jan, 2011

2 commits

  • Add per-target unplug callback support.

    Cc: linux-raid@vger.kernel.org
    Signed-off-by: NeilBrown
    Signed-off-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    NeilBrown
     
  • DM currently implements congestion checking by checking for congestion
    in each component device. For raid456 we also need to check whether the
    stripe cache is congested.

    Add per-target congestion checker callback support.

    Extending the target_callbacks structure with additional callback
    functions allows for establishing multiple callbacks per-target (a
    callback is also needed for unplug).
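
    A hypothetical target wiring up a congestion callback through the
    extended structure (names invented; the registration helper is from
    this series):

    static int my_congested(struct dm_target_callbacks *cb, int bdi_bits)
    {
            /* e.g. report stripe cache congestion here */
            return 0;
    }

    static struct dm_target_callbacks my_callbacks = {
            .congested_fn = my_congested,
    };

    /* registered from the target, e.g. in its constructor: */
    dm_table_add_target_callbacks(ti->table, &my_callbacks);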

    Cc: linux-raid@vger.kernel.org
    Signed-off-by: NeilBrown
    Signed-off-by: Jonathan Brassow
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    NeilBrown
     

12 Aug, 2010

3 commits

  • Split max_io_len_target_boundary out of max_io_len so that the discard
    support can make use of it without duplicating max_io_len code.

    Avoiding max_io_len's split_io logic enables DM's discard support to
    submit the entire discard request to a target. But discards must still
    be split on target boundaries.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Allow discards to be passed through to linear mappings if at least one
    underlying device supports it. Discards will be forwarded only to
    devices that support them.

    A target that supports discards should set num_discard_requests to
    indicate how many times each discard request must be submitted to it.

    Verify table's underlying devices support discards prior to setting the
    associated DM device as capable of discards (via QUEUE_FLAG_DISCARD).
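
    As an illustration, a target would opt in from its constructor (field
    name per this commit):

    ti->num_discard_requests = 1;   /* submit each discard once to this target */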

    Signed-off-by: Mike Snitzer
    Signed-off-by: Mikulas Patocka
    Reviewed-by: Joe Thornber
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • 'target_request_nr' is a more generic name that reflects the fact that
    it will be used for both flush and discard support.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     

11 Dec, 2009

4 commits