11 Jun, 2016

1 commit

  • Add some separation between bio-based and request-based DM core code.

    'struct mapped_device' and other DM core only structures and functions
    have been moved to dm-core.h and all relevant DM core .c files have been
    updated to include dm-core.h rather than dm.h.

    DM targets should _never_ include dm-core.h!

    [block core merge conflict resolution from Stephen Rothwell]
    Signed-off-by: Mike Snitzer
    Signed-off-by: Stephen Rothwell

    Mike Snitzer
     

16 Apr, 2015

3 commits

  • Request-based DM's blk-mq support defaults to off; but a user can easily
    change the default using the dm_mod.use_blk_mq module/boot option.

    Also, you can check what mode a given request-based DM device is using
    with: cat /sys/block/dm-X/dm/use_blk_mq
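
    The same check can be made from a program. A minimal userspace sketch,
    assuming the attribute holds a single decimal integer (the text above
    only gives the path, so the format is an assumption); the helper name
    read_int_attr is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one decimal integer from a sysfs-style
 * attribute file. Returns 0 and stores the value in *out on success,
 * -1 if the file cannot be opened or does not start with an integer.
 */
static int read_int_attr(const char *path, long *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int matched = fscanf(f, "%ld", out);
    fclose(f);
    return matched == 1 ? 0 : -1;
}
```

    For example, read_int_attr("/sys/block/dm-0/dm/use_blk_mq", &v), where
    dm-0 stands in for the dm-X device of interest.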

    This change enabled further cleanup and reduced work (e.g. the
    md->io_pool and md->rq_pool aren't created if using blk-mq).

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
  • Otherwise, for sequential workloads, the dm_request_fn can allow
    excessive request merging at the expense of increased service time.

    Add a per-device sysfs attribute to allow the user to control how long a
    request that is a reasonable merge candidate can be queued on the
    request queue. The resolution of this request dispatch deadline is in
    microseconds (ranging from 1 to 100000 usecs). To set a 20us deadline:
    echo 20 > /sys/block/dm-7/dm/rq_based_seq_io_merge_deadline

    The dm_request_fn's merge heuristic and associated extra accounting is
    disabled by default (rq_based_seq_io_merge_deadline is 0).

    This sysfs attribute is not applicable to bio-based DM devices so it
    will only ever report 0 for them.
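
    A tool that sets the deadline may want to range-check the value before
    writing it. A minimal userspace sketch, assuming that writing 0 restores
    the disabled default (an assumption; only the 1 to 100000 range and the
    default of 0 are stated above). The helper and macro names are
    hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Upper bound quoted in the commit message, in microseconds. */
#define DEMO_MERGE_DEADLINE_MAX_USECS 100000U

/*
 * Hypothetical helper: validate 'usecs' and write it to the attribute
 * file at 'attr_path'. Returns 0 on success, -1 if the value is out of
 * range or the write fails.
 */
static int set_merge_deadline(const char *attr_path, unsigned int usecs)
{
    assert(attr_path != NULL);

    if (usecs > DEMO_MERGE_DEADLINE_MAX_USECS)
        return -1;

    FILE *f = fopen(attr_path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%u\n", usecs) > 0;
    if (fclose(f) != 0)   /* write errors may only surface on close */
        ok = 0;
    return ok ? 0 : -1;
}
```

    For example, set_merge_deadline(
    "/sys/block/dm-7/dm/rq_based_seq_io_merge_deadline", 20) mirrors the
    echo command shown earlier.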

    By allowing a request to remain on the queue it will block other
    requests on the queue. But introducing a short dequeue delay has proven
    very effective at enabling certain sequential IO workloads on really
    fast, yet IOPS constrained, devices to build up slightly larger IOs --
    yielding 90+% throughput improvements. Having precise control over the
    time taken to wait for larger requests to build affords control beyond
    that of waiting for certain IO sizes to accumulate (which would require
    a deadline anyway). This knob will only ever make sense with sequential
    IO workloads and the particular value used is storage configuration
    specific.

    Given the expected niche use-case for when this knob is useful it has
    been deemed acceptable to expose this relatively crude method for
    crafting optimal IO on specific storage -- especially given the solution
    is simple yet effective. In the context of DM multipath, it is
    advisable to tune this sysfs attribute to a value that offers the best
    performance for the common case (e.g. if 4 paths are expected active,
    tune for that; if paths fail then performance may be slightly reduced).

    Alternatives were explored to have request-based DM autotune this value
    (e.g. if/when paths fail) but they were quickly deemed too fragile and
    complex to warrant further design and development time. If this problem
    proves more common as faster storage emerges we'll have to look at
    elevating a generic solution into the block core.

    Tested-by: Shiva Krishna Merla
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
  • Add DM_ATTR_RW() macro and establish .store method in dm_sysfs_ops.
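
    The idea of pairing a read-write attribute macro with a .store dispatch
    can be modelled in userspace. This is a simplified sketch, not the
    kernel code: every demo_ and DEMO_ name is hypothetical, and returning
    -EIO for an attribute without a store method is a choice made for this
    sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Stand-in for a device with one tunable value. */
struct demo_dev { unsigned int deadline_usecs; };

/* An attribute pairs a name with optional show and store methods. */
struct demo_attr {
    const char *name;
    ssize_t (*show)(struct demo_dev *d, char *buf, size_t len);
    ssize_t (*store)(struct demo_dev *d, const char *buf, size_t count);
};

/* Read-only attributes leave .store NULL; read-write ones fill it in. */
#define DEMO_ATTR_RO(_name) \
    static const struct demo_attr demo_attr_##_name = \
        { #_name, demo_attr_##_name##_show, NULL }
#define DEMO_ATTR_RW(_name) \
    static const struct demo_attr demo_attr_##_name = \
        { #_name, demo_attr_##_name##_show, demo_attr_##_name##_store }

static ssize_t demo_attr_deadline_show(struct demo_dev *d, char *buf, size_t len)
{
    return snprintf(buf, len, "%u\n", d->deadline_usecs);
}

static ssize_t demo_attr_deadline_store(struct demo_dev *d, const char *buf,
                                        size_t count)
{
    char *end;
    unsigned long v = strtoul(buf, &end, 10);

    if (end == buf || v > 100000UL)   /* no digits, or out of range */
        return -EINVAL;
    d->deadline_usecs = (unsigned int)v;
    return (ssize_t)count;            /* whole input consumed */
}

DEMO_ATTR_RW(deadline);

/* Generic store dispatch: reject writes to attributes without .store. */
static ssize_t demo_store(const struct demo_attr *a, struct demo_dev *d,
                          const char *buf, size_t count)
{
    assert(a != NULL && d != NULL && buf != NULL);
    if (!a->store)
        return -EIO;
    return a->store(d, buf, count);
}
```

    Calling demo_store(&demo_attr_deadline, &dev, "20\n", 3) updates
    dev.deadline_usecs, after which the show method formats the new value.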

    Signed-off-by: Mike Snitzer

    Mike Snitzer
     

15 Jan, 2014

1 commit

  • This reverts commit be35f48610 ("dm: wait until embedded kobject is
    released before destroying a device") and provides an improved fix.

    The kobject release code that calls the completion must be placed in a
    non-module file, otherwise there is a module unload race (if the process
    calling dm_kobject_release is preempted and the DM module unloaded after
    the completion is triggered, but before dm_kobject_release returns).

    To fix this race, this patch moves the completion code to dm-builtin.c
    which is always compiled directly into the kernel if BLK_DEV_DM is
    selected.

    The patch introduces a new dm_kobject_holder structure; its purpose is
    to keep the completion and kobject in one place, so that it can be
    accessed from non-module code without the need to export the layout of
    struct mapped_device to that code.
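
    The holder idea can be illustrated with a userspace model. This is a
    sketch under stated assumptions, not the kernel definition: the demo_
    types are hypothetical stand-ins for the kobject, the completion and the
    mapped device, and the completion is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's kobject and completion. */
struct demo_kobject {
    int refcount;
    void (*release)(struct demo_kobject *kobj);
};
struct demo_completion { int done; };

/* The holder keeps both objects together so that release code only
 * needs to know this small layout, not the whole device structure. */
struct demo_kobject_holder {
    struct demo_kobject kobj;
    struct demo_completion completion;
};

#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Release method: recover the holder from the embedded kobject and
 * signal the completion. It never touches the enclosing device. */
static void demo_kobject_release(struct demo_kobject *kobj)
{
    struct demo_kobject_holder *h =
        demo_container_of(kobj, struct demo_kobject_holder, kobj);
    h->completion.done = 1;
}

static void demo_kobject_put(struct demo_kobject *kobj)
{
    assert(kobj->refcount > 0);
    if (--kobj->refcount == 0)
        kobj->release(kobj);
}

/* Stand-in for the device; its layout stays private to "module" code. */
struct demo_mapped_device {
    int private_state;
    struct demo_kobject_holder kobj_holder;
};
```

    The release function compiles without ever seeing
    struct demo_mapped_device, which is the property the holder is meant to
    provide.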

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Mikulas Patocka
     

08 Jan, 2014

1 commit

  • There may be other parts of the kernel holding a reference on the dm
    kobject. We must wait until all references are dropped before
    deallocating the mapped_device structure.

    The dm_kobject_release method signals that all references are dropped
    via completion. But dm_kobject_release doesn't free the kobject (which
    is embedded in the mapped_device structure).

    This is the sequence of operations:
    * when destroying a DM device, call kobject_put from dm_sysfs_exit
    * wait until all users stop using the kobject; when that happens, the
    release method is called
    * the release method signals the completion and should return without
    delay
    * the dm device removal code that waits on the completion continues
    * the dm device removal code drops the dm_mod reference the device had
    * the dm device removal code frees the mapped_device structure that
    contains the kobject
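
    The ordering above can be modelled in a single-threaded userspace
    sketch. All demo_ names are hypothetical, the completion is reduced to a
    flag, and where the real removal code would sleep on the completion this
    model simply reports that the device is still referenced.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with an embedded, reference-counted kobject. */
struct demo_md {
    int kobj_refs;      /* references held on the embedded kobject */
    int release_done;   /* stands in for the completion */
};

static struct demo_md *demo_md_alloc(void)
{
    struct demo_md *md = calloc(1, sizeof(*md));
    if (md)
        md->kobj_refs = 1;   /* reference owned by the device itself */
    return md;
}

static void demo_kobj_get(struct demo_md *md)
{
    assert(md->kobj_refs > 0);
    md->kobj_refs++;
}

static void demo_kobj_put(struct demo_md *md)
{
    assert(md->kobj_refs > 0);
    if (--md->kobj_refs == 0)
        md->release_done = 1;   /* release method: signal only, never free */
}

/* First step: device removal drops the device's own reference. */
static void demo_sysfs_exit(struct demo_md *md)
{
    demo_kobj_put(md);
}

/* Remaining steps: free the device only once release has signalled.
 * Returns 0 if freed, -1 if other references still exist. */
static int demo_md_try_free(struct demo_md *md)
{
    if (!md->release_done)
        return -1;
    free(md);
    return 0;
}
```

    With one extra reference taken through demo_kobj_get, demo_md_try_free
    keeps returning -1 after demo_sysfs_exit until that reference is dropped
    with demo_kobj_put.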

    Using kobject this way should avoid the module unload race that was
    mentioned at the beginning of this thread:
    https://lkml.org/lkml/2014/1/4/83

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Mikulas Patocka
     

08 Mar, 2010

1 commit

  • Constify struct sysfs_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing

    Signed-off-by: Emese Revfy
    Acked-by: David Teigland
    Acked-by: Matt Domsch
    Acked-by: Maciej Sosnowski
    Acked-by: Hans J. Koch
    Acked-by: Pekka Enberg
    Acked-by: Jens Axboe
    Acked-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     

17 Feb, 2010

1 commit

  • Revert commit d2bb7df8cac647b92f51fb84ae735771e7adbfa7 at Greg's request.

    Author: Milan Broz
    Date: Thu Dec 10 23:51:53 2009 +0000

    dm: sysfs add empty release function to avoid debug warning

    This patch just removes an unnecessary warning:
    kobject: 'dm': does not have a release() function,
    it is broken and must be fixed.

    The kobject is embedded in mapped device struct, so
    code does not need to release memory explicitly here.

    Cc: Greg KH
    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     

11 Dec, 2009

2 commits


22 Jun, 2009

1 commit


06 Jan, 2009

1 commit

  • Implement simple read-only sysfs entry for device-mapper block device.

    This patch adds a simple sysfs directory named "dm" under block device
    properties and implements
    - name attribute (string containing mapped device name)
    - uuid attribute (string containing UUID, or empty string if not set)

    The kobject is embedded in mapped_device struct, so no additional
    memory allocation is needed for initializing sysfs entry.

    During the processing of a sysfs attribute we need to lock the mapped
    device. This is done by a new function, dm_get_from_kobj, which returns
    the md associated with the kobject and increases the usage count.

    Each 'show attribute' function is responsible for its own locking.

    Signed-off-by: Milan Broz
    Signed-off-by: Alasdair G Kergon

    Milan Broz