11 Nov, 2017

1 commit

  • Using the ARRAY_SIZE macro improves the readability of the code.

    Found with Coccinelle with the following semantic patch:
    @r depends on (org || report)@
    type T;
    T[] E;
    position p;
    @@
    (
    (sizeof(E)@p /sizeof(*E))
    |
    (sizeof(E)@p /sizeof(E[...]))
    |
    (sizeof(E)@p /sizeof(T))
    )

    Signed-off-by: Jérémy Lefaure
    Signed-off-by: Mike Snitzer

    Jérémy Lefaure
     

17 Feb, 2017

1 commit

  • Declare dm_space_map structures as const as they are only passed as an
    argument to the function memcpy. This argument is of type const void *,
    so dm_space_map structures having this property can be declared as
    const.

    File size before:
    text data bss dec hex filename
    4889 240 0 5129 1409 dm-space-map-metadata.o

    File size after:
    text data bss dec hex filename
    5139 0 0 5139 1413 dm-space-map-metadata.o

    Signed-off-by: Bhumika Goyal
    Signed-off-by: Mike Snitzer

    Bhumika Goyal
     

09 Dec, 2016

1 commit

  • In dm_sm_metadata_create() we temporarily change the dm_space_map
    operations from 'ops' (whose .destroy function deallocates the
    sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't).

    If dm_sm_metadata_create() fails in sm_ll_new_metadata() or
    sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls
    dm_sm_destroy() with the intention of freeing the sm_metadata, but it
    doesn't (because the dm_space_map operations is still set to
    'bootstrap_ops').

    Fix this by setting the dm_space_map operations back to 'ops' if
    dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'.

    Signed-off-by: Benjamin Marzinski
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Benjamin Marzinski
     

14 Dec, 2015

1 commit

  • Remove the unused struct block_op pointer that was inadvertantly
    introduced, via cut-and-paste of previous brb_op() code, as part of
    commit 50dd842ad.

    (Cc'ing stable@ because commit 50dd842ad did)

    Fixes: 50dd842ad ("dm space map metadata: fix ref counting bug when bootstrapping a new space map")
    Reported-by: David Binderman
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Mike Snitzer
     

10 Dec, 2015

1 commit

  • When applying block operations (BOPs) do not remove them from the
    uncommitted BOP ring-buffer until after they've been applied -- in case
    we recurse.

    Also, perform BOP_INC operation, in dm_sm_metadata_create() and
    sm_metadata_extend(), in terms of the uncommitted BOP ring-buffer rather
    than using direct calls to sm_ll_inc().

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     

17 Jun, 2015

1 commit

  • The metadata space map has a simplified 'bootstrap' mode that is
    operational when extending the space maps. Whilst in this mode it's
    possible for some refcount decrement operations to become queued (eg, as
    a result of shadowing one of the bitmap indexes). These decrements were
    not being applied when switching out of bootstrap mode.

    The effect of this bug was the leaking of a 4k metadata block. This is
    detected by the latest version of thin_check as a non fatal error.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     

02 Dec, 2014

2 commits


08 Mar, 2014

1 commit

  • This has been a relatively long-standing issue that wasn't nailed down
    until Teng-Feng Yang's meticulous bug report to dm-devel on 3/7/2014,
    see: http://www.redhat.com/archives/dm-devel/2014-March/msg00021.html

    From that report:
    "When decreasing the reference count of a metadata block with its
    reference count equals 3, we will call dm_btree_remove() to remove
    this enrty from the B+tree which keeps the reference count info in
    metadata device.

    The B+tree will try to rebalance the entry of the child nodes in each
    node it traversed, and the rebalance process contains the following
    steps.

    (1) Finding the corresponding children in current node (shadow_current(s))
    (2) Shadow the children block (issue BOP_INC)
    (3) redistribute keys among children, and free children if necessary (issue BOP_DEC)

    Since the update of a metadata block's reference count could be
    recursive, we will stash these reference count update operations in
    smm->uncommitted and then process them in a FILO fashion.

    The problem is that step(3) could free the children which is created
    in step(2), so the BOP_DEC issued in step(3) will be carried out
    before the BOP_INC issued in step(2) since these BOPs will be
    processed in FILO fashion. Once the BOP_DEC from step(3) tries to
    decrease the reference count of newly shadow block, it will report
    failure for its reference equals 0 before decreasing. It looks like we
    can solve this issue by processing these BOPs in a FIFO fashion
    instead of FILO."

    Commit 5b564d80 ("dm space map: disallow decrementing a reference count
    below zero") changed the code to report an error for this temporary
    refcount decrement below zero. So what was previously a harmless
    invalid refcount became a hard failure due to the new error path:

    device-mapper: space map common: unable to decrement a reference count below 0
    device-mapper: thin: 253:6: dm_thin_insert_block() failed: error = -22
    device-mapper: thin: 253:6: switching pool to read-only mode

    This bug is in dm persistent-data code that is common to the DM thin and
    cache targets. So any users of those targets should apply this fix.

    Fix this by applying recursive space map operations in FIFO order rather
    than FILO.

    Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=68801

    Reported-by: Apollon Oikonomopoulos
    Reported-by: edwillam1007@gmail.com
    Reported-by: Teng-Feng Yang
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org # 3.13+

    Joe Thornber
     

28 Feb, 2014

1 commit

  • It was always intended that a user could provide a thin metadata device
    that is larger than the max supported by the on-disk format. The extra
    space would just go unused.

    Unfortunately that never worked. If the user attempted to use a larger
    metadata device on creation they would get an error like the following:

    device-mapper: space map common: space map too large
    device-mapper: transaction manager: couldn't create metadata space map
    device-mapper: thin metadata: tm_create_with_sm failed
    device-mapper: table: 252:17: thin-pool: Error creating metadata object
    device-mapper: ioctl: error adding target to table

    Fix this by allowing the initial metadata space map creation to cap its
    size at the max number of blocks supported (DM_SM_METADATA_MAX_BLOCKS).
    get_metadata_dev_size() must also impose DM_SM_METADATA_MAX_BLOCKS (via
    THIN_METADATA_MAX_SECTORS), otherwise extending metadata would cap at
    THIN_METADATA_MAX_SECTORS_WARNING (which is larger than supported).

    Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
    the sizeof the disk_bitmap_header. So the supported maximum metadata
    size is a bit smaller (reduced from 33423360 to 33292800 sectors).

    Lastly, remove the "excess space will not be used" warning message from
    get_metadata_dev_size(); it resulted in printing the warning multiple
    times. Factor out warn_if_metadata_device_too_big(), call it from
    pool_ctr() and maybe_resize_metadata_dev().

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber

    Mike Snitzer
     

22 Jan, 2014

1 commit

  • This bug was introduced in commit 7e664b3dec431e ("dm space map metadata:
    fix extending the space map").

    When extending a dm-thin metadata volume we:

    - Switch the space map into a simple bootstrap mode, which allocates
    all space linearly from the newly added space.
    - Add new bitmap entries for the new space
    - Increment the reference counts for those newly allocated bitmap
    entries
    - Commit changes to disk
    - Switch back out of bootstrap mode.

    But, the disk commit may allocate space itself, if so this fact will be
    lost when switching out of bootstrap mode.

    The bug exhibited itself as an error when the bitmap_root, with an
    erroneous ref count of 0, was subsequently decremented as part of a
    later disk commit. This would cause the disk commit to fail, and thinp
    to enter read_only mode. The metadata was not damaged (thin_check
    passed).

    The fix is to put the increments + commit into a loop, running until
    the commit has not allocated extra space. In practise this loop only
    runs twice.

    With this fix the following device mapper testsuite test passes:
    dmtest run --suite thin-provisioning -n thin_remove_works_after_resize

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org # depends on commit 7e664b3dec431e

    Joe Thornber
     

08 Jan, 2014

1 commit

  • When extending a metadata space map we should do the first commit whilst
    still in bootstrap mode -- a mode where all blocks get allocated in the
    new area.

    That way the commit overhead is allocated from the newly added space.
    Otherwise we risk running out of space.

    With this fix, and the previous commit "dm space map common: make sure
    new space is used during extend", the following device mapper testsuite
    test passes:
    dmtest run --suite thin-provisioning -n /resize_metadata_no_io/

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org

    Joe Thornber
     

07 Jan, 2014

1 commit


11 Dec, 2013

1 commit

  • Commit 2fc48021f4afdd109b9e52b6eef5db89ca80bac7 ("dm persistent
    metadata: add space map threshold callback") introduced a regression
    to the metadata block allocation path that resulted in errors being
    ignored. This regression was uncovered by running the following
    device-mapper-test-suite test:
    dmtest run --suite thin-provisioning -n /exhausting_metadata_space_causes_fail_mode/

    The ignored error codes in sm_metadata_new_block() could crash the
    kernel through use of either the dm-thin or dm-cache targets, e.g.:

    device-mapper: thin: 253:4: reached low water mark for metadata device: sending event.
    device-mapper: space map metadata: unable to allocate new metadata block
    general protection fault: 0000 [#1] SMP
    ...
    Workqueue: dm-thin do_worker [dm_thin_pool]
    task: ffff880035ce2ab0 ti: ffff88021a054000 task.ti: ffff88021a054000
    RIP: 0010:[] [] metadata_ll_load_ie+0x15/0x30 [dm_persistent_data]
    RSP: 0018:ffff88021a055a68 EFLAGS: 00010202
    RAX: 003fc8243d212ba0 RBX: ffff88021a780070 RCX: ffff88021a055a78
    RDX: ffff88021a055a78 RSI: 0040402222a92a80 RDI: ffff88021a780070
    RBP: ffff88021a055a68 R08: ffff88021a055ba4 R09: 0000000000000010
    R10: 0000000000000000 R11: 00000002a02e1000 R12: ffff88021a055ad4
    R13: 0000000000000598 R14: ffffffffa0338470 R15: ffff88021a055ba4
    FS: 0000000000000000(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00007f467c0291b8 CR3: 0000000001a0b000 CR4: 00000000000007e0
    Stack:
    ffff88021a055ab8 ffffffffa0332020 ffff88021a055b30 0000000000000001
    ffff88021a055b30 0000000000000000 ffff88021a055b18 0000000000000000
    ffff88021a055ba4 ffff88021a055b98 ffff88021a055ae8 ffffffffa033304c
    Call Trace:
    [] sm_ll_lookup_bitmap+0x40/0xa0 [dm_persistent_data]
    [] sm_metadata_count_is_more_than_one+0x8c/0xc0 [dm_persistent_data]
    [] dm_tm_shadow_block+0x65/0x110 [dm_persistent_data]
    [] sm_ll_mutate+0x80/0x300 [dm_persistent_data]
    [] ? set_ref_count+0x10/0x10 [dm_persistent_data]
    [] sm_ll_inc+0x1a/0x20 [dm_persistent_data]
    [] sm_disk_new_block+0x60/0x80 [dm_persistent_data]
    [] ? down_write+0x16/0x40
    [] dm_pool_alloc_data_block+0x54/0x80 [dm_thin_pool]
    [] alloc_data_block+0x9c/0x130 [dm_thin_pool]
    [] provision_block+0x4e/0x180 [dm_thin_pool]
    [] ? dm_thin_find_block+0x6a/0x110 [dm_thin_pool]
    [] process_bio+0x1ca/0x1f0 [dm_thin_pool]
    [] ? mempool_free+0x8d/0xa0
    [] process_deferred_bios+0xc5/0x230 [dm_thin_pool]
    [] do_worker+0x51/0x60 [dm_thin_pool]
    [] process_one_work+0x182/0x3b0
    [] worker_thread+0x120/0x3a0
    [] ? manage_workers+0x160/0x160
    [] kthread+0xce/0xe0
    [] ? kthread_freezable_should_stop+0x70/0x70
    [] ret_from_fork+0x7c/0xb0
    [] ? kthread_freezable_should_stop+0x70/0x70
    [] ret_from_fork+0x7c/0xb0
    [] ? kthread_freezable_should_stop+0x70/0x70

    Signed-off-by: Mike Snitzer
    Acked-by: Joe Thornber
    Cc: stable@vger.kernel.org # v3.10+

    Mike Snitzer
     

10 May, 2013

4 commits


22 Dec, 2012

1 commit


01 Nov, 2011

1 commit

  • The persistent-data library offers a re-usable framework for the storage
    and management of on-disk metadata in device-mapper targets.

    It's used by the thin-provisioning target in the next patch and in an
    upcoming hierarchical storage target.

    For further information, please read
    Documentation/device-mapper/persistent-data.txt

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Joe Thornber