28 Jan, 2021

1 commit

  • This reverts commit cd74693870fb748d812867ba49af733d689a3604.

    This is a workaround for allowing dm-crypt crypto operations to be
    offloaded to caam crypto accelerator.
    It's needed because crypto algorithms registered by caam are marked with
    CRYPTO_ALG_ALLOCATES_MEMORY flag.

    Background:
    Red Hat reported possible allocation issues in dm-crypt, dm-integrity:
    Link: https://lore.kernel.org/linux-crypto/alpine.LRH.2.02.2006091259250.30590@file01.intranet.prod.int.rdu2.redhat.com
    The solution found was a mechanism to let dm-crypt, dm-integrity avoid
    using crypto algorithms that allocate memory "at runtime" - by specifying
    the CRYPTO_ALG_ALLOCATES_MEMORY flag introduced in
    commit fbb6cda44190 ("crypto: algapi - introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY")

    Signed-off-by: Horia Geantă
    Reviewed-by: Manish Tomar

    Horia Geantă
     

20 Jan, 2021

10 commits

  • commit 0378c625afe80eb3f212adae42cc33c9f6f31abf upstream.

    There wasn't ever a real need to log an error in the kernel log for
    ioctls issued with insufficient permissions. Simply return an error
    and if an admin/user is sufficiently motivated they can enable DM's
    dynamic debugging to see an explanation for why the ioctls were
    disallowed.

    Reported-by: Nir Soffer
    Fixes: e980f62353c6 ("dm: don't allow ioctls to targets that don't map to whole devices")
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit b690bd546b227c32b860dae985a18bed8aa946fe upstream.

    Without crc32 support, this driver fails to link:

    arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_write_sb':
    dm-zoned-metadata.c:(.text+0xe98): undefined reference to `crc32_le'
    arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_check_sb':
    dm-zoned-metadata.c:(.text+0x7978): undefined reference to `crc32_le'

    Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Damien Le Moal
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit c87a95dc28b1431c7e77e2c0c983cf37698089d2 upstream.

    On some specific hardware on early boot we occasionally get:

    [ 1193.920255][ T0] BUG: sleeping function called from invalid context at mm/mempool.c:381
    [ 1193.936616][ T0] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/69
    [ 1193.953233][ T0] no locks held by swapper/69/0.
    [ 1193.965871][ T0] irq event stamp: 575062
    [ 1193.977724][ T0] hardirqs last enabled at (575061): tick_nohz_idle_exit+0xe2/0x3e0
    [ 1194.002762][ T0] hardirqs last disabled at (575062): flush_smp_call_function_from_idle+0x4f/0x80
    [ 1194.029035][ T0] softirqs last enabled at (575050): asm_call_irq_on_stack+0x12/0x20
    [ 1194.054227][ T0] softirqs last disabled at (575043): asm_call_irq_on_stack+0x12/0x20
    [ 1194.079389][ T0] CPU: 69 PID: 0 Comm: swapper/69 Not tainted 5.10.6-cloudflare-kasan-2021.1.4-dev #1
    [ 1194.104103][ T0] Hardware name: NULL R162-Z12-CD/MZ12-HD4-CD, BIOS R10 06/04/2020
    [ 1194.119591][ T0] Call Trace:
    [ 1194.130233][ T0] dump_stack+0x9a/0xcc
    [ 1194.141617][ T0] ___might_sleep.cold+0x180/0x1b0
    [ 1194.153825][ T0] mempool_alloc+0x16b/0x300
    [ 1194.165313][ T0] ? remove_element+0x160/0x160
    [ 1194.176961][ T0] ? blk_mq_end_request+0x4b/0x490
    [ 1194.188778][ T0] crypt_convert+0x27f6/0x45f0 [dm_crypt]
    [ 1194.201024][ T0] ? rcu_read_lock_sched_held+0x3f/0x70
    [ 1194.212906][ T0] ? module_assert_mutex_or_preempt+0x3e/0x70
    [ 1194.225318][ T0] ? __module_address.part.0+0x1b/0x3a0
    [ 1194.237212][ T0] ? is_kernel_percpu_address+0x5b/0x190
    [ 1194.249238][ T0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0 [dm_crypt]
    [ 1194.261593][ T0] ? is_module_address+0x25/0x40
    [ 1194.272905][ T0] ? static_obj+0x8a/0xc0
    [ 1194.283582][ T0] ? lockdep_init_map_waits+0x26a/0x700
    [ 1194.295570][ T0] ? __raw_spin_lock_init+0x39/0x110
    [ 1194.307330][ T0] kcryptd_crypt_read_convert+0x31c/0x560 [dm_crypt]
    [ 1194.320496][ T0] ? kcryptd_queue_crypt+0x1be/0x380 [dm_crypt]
    [ 1194.333203][ T0] blk_update_request+0x6d7/0x1500
    [ 1194.344841][ T0] ? blk_mq_trigger_softirq+0x190/0x190
    [ 1194.356831][ T0] blk_mq_end_request+0x4b/0x490
    [ 1194.367994][ T0] ? blk_mq_trigger_softirq+0x190/0x190
    [ 1194.379693][ T0] flush_smp_call_function_queue+0x24b/0x560
    [ 1194.391847][ T0] flush_smp_call_function_from_idle+0x59/0x80
    [ 1194.403969][ T0] do_idle+0x287/0x450
    [ 1194.413891][ T0] ? arch_cpu_idle_exit+0x40/0x40
    [ 1194.424716][ T0] ? lockdep_hardirqs_on_prepare+0x286/0x3f0
    [ 1194.436399][ T0] ? _raw_spin_unlock_irqrestore+0x39/0x40
    [ 1194.447759][ T0] cpu_startup_entry+0x19/0x20
    [ 1194.458038][ T0] secondary_startup_64_no_verify+0xb0/0xbb

    IO completion can be queued to a different CPU by the block subsystem as a "call
    single function/data". The CPU may run these routines from the idle task, but it
    does so with interrupts disabled.

    It is not a good idea to do decryption with irqs disabled even in an idle task
    context, so just defer it to a tasklet (as is done with requests from hard irqs).

    Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
    Cc: stable@vger.kernel.org # v5.9+
    Signed-off-by: Ignat Korchagin
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Ignat Korchagin
     
  • commit 8e14f610159d524cd7aac37982826d3ef75c09e8 upstream.

    Sometimes, when dm-crypt executes decryption in a tasklet, we may get
    "BUG: KASAN: use-after-free in tasklet_action_common.constprop..."
    with a kasan-enabled kernel.

    When the decryption fully completes in the tasklet, dm-crypt will call
    bio_endio(), which in turn will call clone_endio() from dm.c core code. That
    function frees the resources associated with the bio, including per bio private
    structures. For dm-crypt it will free the current struct dm_crypt_io, which
    contains our tasklet object, causing use-after-free, when the tasklet is being
    dequeued by the kernel.

    To avoid this, do not call bio_endio() from the current tasklet context, but
    delay its execution to the dm-crypt IO workqueue.

    Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
    Cc: stable@vger.kernel.org # v5.9+
    Signed-off-by: Ignat Korchagin
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Ignat Korchagin
     
  • commit 8abec36d1274bbd5ae8f36f3658b9abb3db56c31 upstream.

    Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd
    workqueues") made it possible for some code paths in dm-crypt to be
    executed in softirq context, when the underlying driver processes IO
    requests in interrupt/softirq context.

    When Crypto API backlogs a crypto request, dm-crypt uses
    wait_for_completion to avoid sending further requests to an already
    overloaded crypto driver. However, if the code is executing in softirq
    context, we might get the following stacktrace:

    [ 210.235213][ C0] BUG: scheduling while atomic: fio/2602/0x00000102
    [ 210.236701][ C0] Modules linked in:
    [ 210.237566][ C0] CPU: 0 PID: 2602 Comm: fio Tainted: G W 5.10.0+ #50
    [ 210.239292][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
    [ 210.241233][ C0] Call Trace:
    [ 210.241946][ C0] <IRQ>
    [ 210.242561][ C0] dump_stack+0x7d/0xa3
    [ 210.243466][ C0] __schedule_bug.cold+0xb3/0xc2
    [ 210.244539][ C0] __schedule+0x156f/0x20d0
    [ 210.245518][ C0] ? io_schedule_timeout+0x140/0x140
    [ 210.246660][ C0] schedule+0xd0/0x270
    [ 210.247541][ C0] schedule_timeout+0x1fb/0x280
    [ 210.248586][ C0] ? usleep_range+0x150/0x150
    [ 210.249624][ C0] ? unpoison_range+0x3a/0x60
    [ 210.250632][ C0] ? ____kasan_kmalloc.constprop.0+0x82/0xa0
    [ 210.251949][ C0] ? unpoison_range+0x3a/0x60
    [ 210.252958][ C0] ? __prepare_to_swait+0xa7/0x190
    [ 210.254067][ C0] do_wait_for_common+0x2ab/0x370
    [ 210.255158][ C0] ? usleep_range+0x150/0x150
    [ 210.256192][ C0] ? bit_wait_io_timeout+0x160/0x160
    [ 210.257358][ C0] ? blk_update_request+0x757/0x1150
    [ 210.258582][ C0] ? _raw_spin_lock_irq+0x82/0xd0
    [ 210.259674][ C0] ? _raw_read_unlock_irqrestore+0x30/0x30
    [ 210.260917][ C0] wait_for_completion+0x4c/0x90
    [ 210.261971][ C0] crypt_convert+0x19a6/0x4c00
    [ 210.263033][ C0] ? _raw_spin_lock_irqsave+0x87/0xe0
    [ 210.264193][ C0] ? kasan_set_track+0x1c/0x30
    [ 210.265191][ C0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0
    [ 210.266283][ C0] ? kmem_cache_free+0x104/0x470
    [ 210.267363][ C0] ? crypt_endio+0x91/0x180
    [ 210.268327][ C0] kcryptd_crypt_read_convert+0x30e/0x420
    [ 210.269565][ C0] blk_update_request+0x757/0x1150
    [ 210.270563][ C0] blk_mq_end_request+0x4b/0x480
    [ 210.271680][ C0] blk_done_softirq+0x21d/0x340
    [ 210.272775][ C0] ? _raw_spin_lock+0x81/0xd0
    [ 210.273847][ C0] ? blk_mq_stop_hw_queue+0x30/0x30
    [ 210.275031][ C0] ? _raw_read_lock_irq+0x40/0x40
    [ 210.276182][ C0] __do_softirq+0x190/0x611
    [ 210.277203][ C0] ? handle_edge_irq+0x221/0xb60
    [ 210.278340][ C0] asm_call_irq_on_stack+0x12/0x20
    [ 210.279514][ C0] </IRQ>
    [ 210.280164][ C0] do_softirq_own_stack+0x37/0x40
    [ 210.281281][ C0] irq_exit_rcu+0x110/0x1b0
    [ 210.282286][ C0] common_interrupt+0x74/0x120
    [ 210.283376][ C0] asm_common_interrupt+0x1e/0x40
    [ 210.284496][ C0] RIP: 0010:_aesni_enc1+0x65/0xb0

    Fix this by making crypt_convert function reentrant from the point of
    a single bio and make dm-crypt defer further bio processing to a
    workqueue, if Crypto API backlogs a request in interrupt context.

    Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
    Cc: stable@vger.kernel.org # v5.9+
    Signed-off-by: Ignat Korchagin
    Acked-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Ignat Korchagin
     
  • commit d68b29584c25dbacd01ed44a3e45abb35353f1de upstream.

    Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd
    workqueues") made it possible for some code paths in dm-crypt to be
    executed in softirq context, when the underlying driver processes IO
    requests in interrupt/softirq context.

    In this case sometimes when allocating a new crypto request we may get
    a stacktrace like below:

    [ 210.103008][ C0] BUG: sleeping function called from invalid context at mm/mempool.c:381
    [ 210.104746][ C0] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2602, name: fio
    [ 210.106599][ C0] CPU: 0 PID: 2602 Comm: fio Tainted: G W 5.10.0+ #50
    [ 210.108331][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
    [ 210.110212][ C0] Call Trace:
    [ 210.110921][ C0] <IRQ>
    [ 210.111527][ C0] dump_stack+0x7d/0xa3
    [ 210.112411][ C0] ___might_sleep.cold+0x122/0x151
    [ 210.113527][ C0] mempool_alloc+0x16b/0x2f0
    [ 210.114524][ C0] ? __queue_work+0x515/0xde0
    [ 210.115553][ C0] ? mempool_resize+0x700/0x700
    [ 210.116586][ C0] ? crypt_endio+0x91/0x180
    [ 210.117479][ C0] ? blk_update_request+0x757/0x1150
    [ 210.118513][ C0] ? blk_mq_end_request+0x4b/0x480
    [ 210.119572][ C0] ? blk_done_softirq+0x21d/0x340
    [ 210.120628][ C0] ? __do_softirq+0x190/0x611
    [ 210.121626][ C0] crypt_convert+0x29f9/0x4c00
    [ 210.122668][ C0] ? _raw_spin_lock_irqsave+0x87/0xe0
    [ 210.123824][ C0] ? kasan_set_track+0x1c/0x30
    [ 210.124858][ C0] ? crypt_iv_tcw_ctr+0x4a0/0x4a0
    [ 210.125930][ C0] ? kmem_cache_free+0x104/0x470
    [ 210.126973][ C0] ? crypt_endio+0x91/0x180
    [ 210.127947][ C0] kcryptd_crypt_read_convert+0x30e/0x420
    [ 210.129165][ C0] blk_update_request+0x757/0x1150
    [ 210.130231][ C0] blk_mq_end_request+0x4b/0x480
    [ 210.131294][ C0] blk_done_softirq+0x21d/0x340
    [ 210.132332][ C0] ? _raw_spin_lock+0x81/0xd0
    [ 210.133289][ C0] ? blk_mq_stop_hw_queue+0x30/0x30
    [ 210.134399][ C0] ? _raw_read_lock_irq+0x40/0x40
    [ 210.135458][ C0] __do_softirq+0x190/0x611
    [ 210.136409][ C0] ? handle_edge_irq+0x221/0xb60
    [ 210.137447][ C0] asm_call_irq_on_stack+0x12/0x20
    [ 210.138507][ C0] </IRQ>
    [ 210.139118][ C0] do_softirq_own_stack+0x37/0x40
    [ 210.140191][ C0] irq_exit_rcu+0x110/0x1b0
    [ 210.141151][ C0] common_interrupt+0x74/0x120
    [ 210.142171][ C0] asm_common_interrupt+0x1e/0x40

    Fix this by allocating crypto requests with GFP_ATOMIC mask in
    interrupt context.

    Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
    Cc: stable@vger.kernel.org # v5.9+
    Reported-by: Maciej S. Szmigiero
    Signed-off-by: Ignat Korchagin
    Acked-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Ignat Korchagin
     
  • commit 17ffc193cdc6dc7a613d00d8ad47fc1f801b9bf0 upstream.

    Advance the maximum number of arguments from 9 to 15 to account for
    all potential feature flags that may be supplied.

    Linux 4.19 added "meta_device"
    (356d9d52e1221ba0c9f10b8b38652f78a5298329) and "recalculate"
    (a3fcf7253139609bf9ff901fbf955fba047e75dd) flags.

    Commit 468dfca38b1a6fbdccd195d875599cb7c8875cd9 added
    "sectors_per_bit" and "bitmap_flush_interval".

    Commit 84597a44a9d86ac949900441cea7da0af0f2f473 added
    "allow_discards".

    And the commit d537858ac8aaf4311b51240893add2fc62003b97 added
    "fix_padding".

    Signed-off-by: Mikulas Patocka
    Cc: stable@vger.kernel.org # v4.19+
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     
  • commit 9b5948267adc9e689da609eb61cf7ed49cae5fa8 upstream.

    With external metadata device, flush requests are not passed down to the
    data device.

    Fix this by submitting the flush request in dm_integrity_flush_buffers. In
    order to not degrade performance, we overlap the data device flush with
    the metadata device flush.

    Reported-by: Lukas Straub
    Signed-off-by: Mikulas Patocka
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     
  • commit fcc42338375a1e67b8568dbb558f8b784d0f3b01 upstream.

    If the origin device has a volatile write-back cache and the following
    events occur:

    1: After finishing merge operation of one set of exceptions,
    merge_callback() is invoked.
    2: Update the metadata in COW device tracking the merge completion.
    This update to COW device is flushed cleanly.
    3: System crashes and the origin device's cache where the recent
    merge was completed has not been flushed.

    During the next cycle when we read the metadata from the COW device,
    we will skip reading those metadata whose merge was completed in
    step (1). This will lead to data loss/corruption.

    To address this, flush the origin device post merge IO before
    updating the metadata.

    Cc: stable@vger.kernel.org
    Signed-off-by: Akilesh Kailash
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Akilesh Kailash
     
  • commit cc07d72bf350b77faeffee1c37bc52197171473f upstream.

    Block core warned that discard_granularity was 0 for dm-raid with
    personality of raid1. Reason is that raid_io_hints() was incorrectly
    special-casing raid1 rather than raid0.

    Fix raid_io_hints() by removing discard limits settings for
    raid1. Check for raid0 instead.

    Fixes: 61697a6abd24a ("dm: eliminate 'split_discard_bios' flag from DM target interface")
    Cc: stable@vger.kernel.org
    Reported-by: Zdenek Kabelac
    Reported-by: Mikulas Patocka
    Reported-by: Stephan Bärwolf
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     

17 Jan, 2021

1 commit

  • commit 5342fd4255021ef0c4ce7be52eea1c4ebda11c63 upstream.

    If BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET is set in the incompat feature
    set, the cache device was created with the obsolete layout that uses
    obso_bucket_size_hi. bcache no longer supports this feature bit; a new
    BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE incompat feature bit was added
    to provide a better layout for large bucket sizes.

    For legacy compatibility, if a cache device was created with the
    obsolete BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit, all bcache
    devices attached to this cache set should be set to read-only. The
    dirty data can then be written back to the backing device before the
    cache device is re-created with the
    BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE feature bit by the latest
    bcache-tools.

    This patch checks the BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit
    when running a cache set and when attaching a bcache device to the cache
    set. If this bit is set:
    - When running a cache set, print an error message to the kernel log
      indicating that all subsequently attached bcache devices will be
      read-only.
    - When attaching a bcache device, print an error message to the kernel
      log indicating that the attached bcache device will be read-only, and
      ask users to update to the latest bcache-tools.

    This change only affects cache devices whose bucket size is >= 32MB.
    Such sizes are intended for zoned SSDs, and almost nobody uses them at
    the moment. If you don't explicitly set a large bucket size for a zoned
    SSD, this change is totally transparent to your bcache device.

    Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
    Signed-off-by: Coly Li
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Coly Li
     

13 Jan, 2021

3 commits

  • commit b16671e8f493e3df40b1fb0dff4078f391c5099a upstream.

    When the large bucket feature was added, BCH_FEATURE_INCOMPAT_LARGE_BUCKET
    was introduced into the incompat feature set. It used bucket_size_hi
    (which was added at the tail of struct cache_sb_disk) together with the
    existing bucket_size in struct cache_sb_disk to extend the 16-bit bucket
    size to 32 bits.

    This is not a good idea; there are two obvious problems:
    - Bucket size is always a power of 2, so if log2(bucket size) is stored
      in the existing bucket_size of struct cache_sb_disk, bucket_size_hi is
      unnecessary.
    - Macro csum_set() assumes d[SB_JOURNAL_BUCKETS] is the last member of
      struct cache_sb_disk, but bucket_size_hi was added after d[], which
      makes csum_set() calculate an unexpected super block checksum.

    To fix the above problems, this patch introduces a new incompat feature
    bit, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE. When this bit is set,
    bucket_size in struct cache_sb_disk stores the order (log2) of the
    power-of-2 bucket size. When the user specifies a bucket size larger than
    32768 sectors, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE is set in the
    incompat feature set, and bucket_size stores log2(bucket size) rather
    than the real bucket size value.
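
    The encoding described above can be sketched in userspace C. This is a
    minimal illustration, not the bcache implementation: struct demo_sb, the
    DEMO_* constants and the demo_* helpers are hypothetical names, and the
    flag's bit position is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's actual definitions. */
#define DEMO_INCOMPAT_LOG_LARGE_BUCKET_SIZE UINT64_C(1)
#define DEMO_MAX_PLAIN_BUCKET_SECTORS 32768u

struct demo_sb {
	uint64_t feature_incompat;
	uint16_t bucket_size; /* sectors, or log2(sectors) when the flag is set */
};

/* Integer log2 of a nonzero value. */
static unsigned int demo_ilog2(uint32_t v)
{
	unsigned int r = 0;

	while ((v >>= 1) != 0)
		r++;
	return r;
}

/* Store a bucket size, switching to the log2 form above the threshold. */
static void demo_set_bucket_size(struct demo_sb *sb, uint32_t sectors)
{
	/* Bucket sizes are always a power of two. */
	assert(sectors != 0 && (sectors & (sectors - 1)) == 0);

	if (sectors > DEMO_MAX_PLAIN_BUCKET_SECTORS) {
		sb->feature_incompat |= DEMO_INCOMPAT_LOG_LARGE_BUCKET_SIZE;
		sb->bucket_size = (uint16_t)demo_ilog2(sectors);
	} else {
		sb->feature_incompat &= ~DEMO_INCOMPAT_LOG_LARGE_BUCKET_SIZE;
		sb->bucket_size = (uint16_t)sectors;
	}
}

/* Decode the bucket size in sectors from either representation. */
static uint32_t demo_get_bucket_size(const struct demo_sb *sb)
{
	if (sb->feature_incompat & DEMO_INCOMPAT_LOG_LARGE_BUCKET_SIZE)
		return UINT32_C(1) << sb->bucket_size;
	return sb->bucket_size;
}
```

    With this scheme a 65536-sector bucket is stored as 16 in the same
    16-bit field, so no extra on-disk member is needed.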

    The obsolete BCH_FEATURE_INCOMPAT_LARGE_BUCKET won't be used anymore.
    It is renamed to BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET and is still
    recognized by the kernel driver, but only for legacy compatibility. The
    previous bucket_size_hi is renamed to obso_bucket_size_hi in struct
    cache_sb_disk and is no longer used in bcache-tools.

    For cache device created with BCH_FEATURE_INCOMPAT_LARGE_BUCKET feature,
    bcache-tools and kernel driver still recognize the feature string and
    display it as "obso_large_bucket".

    With this change, the unnecessary extension of the bcache on-disk
    super block is avoided, and csum_set() generates the expected
    checksum as well.

    Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
    Signed-off-by: Coly Li
    Cc: stable@vger.kernel.org # 5.9+
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Coly Li
     
  • commit 1dfc0686c29a9bbd3a446a29f9ccde3dec3bc75a upstream.

    This patch adds a check for features that are incompatible with the
    currently supported feature sets.

    Now, if a bcache device created by bcache-tools has features that the
    current kernel doesn't support, read_super() will fail with an error
    message. E.g. if an unsupported incompatible feature is detected,
    bcache registration will fail with dmesg "bcache: register_bcache() error :
    Unsupported incompatible feature found".

    Fixes: d721a43ff69c ("bcache: increase super block version for cache device and backing device")
    Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
    Signed-off-by: Coly Li
    Cc: stable@vger.kernel.org # 5.9+
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Coly Li
     
  • commit f7b4943dea48a572ad751ce1f18a245d43debe7e upstream.

    This patch fixes the following typos,
    from BCH_FEATURE_COMPAT_SUUP to BCH_FEATURE_COMPAT_SUPP
    from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_INCOMPAT_SUPP
    from BCH_FEATURE_RO_COMPAT_SUUP to BCH_FEATURE_RO_COMPAT_SUPP

    Fixes: d721a43ff69c ("bcache: increase super block version for cache device and backing device")
    Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
    Signed-off-by: Coly Li
    Cc: stable@vger.kernel.org # 5.9+
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Coly Li
     

06 Jan, 2021

2 commits

  • [ Upstream commit 252bd1256396cebc6fc3526127fdb0b317601318 ]

    If emergency system shutdown is called, like by thermal shutdown,
    a dm device could be alive when the block device couldn't process
    I/O requests anymore. In this state, the handling of I/O errors
    by new dm I/O requests or by those already in-flight can lead to
    a verity corruption state, which is a misjudgment.

    So, skip verity work in response to I/O error when system is shutting
    down.

    Signed-off-by: Hyeongseok Kim
    Reviewed-by: Sami Tolvanen
    Signed-off-by: Mike Snitzer
    Signed-off-by: Sasha Levin

    Hyeongseok Kim
     
  • commit 93decc563637c4288380912eac0eb42fb246cc04 upstream.

    In __make_request() a new r10bio is allocated and passed to
    raid10_read_request(). The read_slot member of the bio is not
    initialized, and the raid10_read_request() uses it to index an
    array. This leads to occasional panics.

    Fix by initializing the field to an invalid value and checking for
    a valid value in raid10_read_request().
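
    The pattern can be sketched as follows. This is a simplified userspace
    model, not the raid10 code: struct demo_r10bio, DEMO_COPIES and the
    demo_* helpers are hypothetical, and using -1 as the invalid marker is
    an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_COPIES 2

struct demo_r10bio {
	int read_slot;          /* -1 means "no slot chosen yet" */
	int devs[DEMO_COPIES];  /* hypothetical per-copy device numbers */
};

/* Initialize every field, including read_slot, before first use. */
static void demo_r10bio_init(struct demo_r10bio *r)
{
	assert(r != NULL);
	r->read_slot = -1;
	for (size_t i = 0; i < DEMO_COPIES; i++)
		r->devs[i] = (int)i + 10;
}

/* Return the device for a previously chosen slot, or -1 if there is none. */
static int demo_previous_read_dev(const struct demo_r10bio *r)
{
	int slot = r->read_slot;

	/* Never index the array with an unset or out-of-range slot. */
	if (slot < 0 || slot >= DEMO_COPIES)
		return -1;
	return r->devs[slot];
}
```

    A freshly initialized request reports no previous device, so the read
    path cannot index devs[] with a garbage value.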

    Cc: stable@vger.kernel.org
    Signed-off-by: Kevin Vigor
    Signed-off-by: Song Liu
    Signed-off-by: Greg Kroah-Hartman

    Kevin Vigor
     

30 Dec, 2020

3 commits


26 Dec, 2020

1 commit

  • commit c731b84b51bf7fe83448bea8f56a6d55006b0615 upstream.

    Syzkaller reports the warning below:
    WARNING: CPU: 0 PID: 9647 at drivers/md/md.c:7169
    ...
    Call Trace:
    ...
    RIP: 0010:md_ioctl+0x4017/0x5980 drivers/md/md.c:7169
    RSP: 0018:ffff888096027950 EFLAGS: 00010293
    RAX: ffff88809322c380 RBX: 0000000000000932 RCX: ffffffff84e266f2
    RDX: 0000000000000000 RSI: ffffffff84e299f7 RDI: 0000000000000007
    RBP: ffff888096027bc0 R08: ffff88809322c380 R09: ffffed101341a482
    R10: ffff888096027940 R11: ffff88809a0d240f R12: 0000000000000932
    R13: ffff8880a2c14100 R14: ffff88809a0d2268 R15: ffff88809a0d2408
    __blkdev_driver_ioctl block/ioctl.c:304 [inline]
    blkdev_ioctl+0xece/0x1c10 block/ioctl.c:606
    block_ioctl+0xee/0x130 fs/block_dev.c:1930
    vfs_ioctl fs/ioctl.c:46 [inline]
    file_ioctl fs/ioctl.c:509 [inline]
    do_vfs_ioctl+0xd5f/0x1380 fs/ioctl.c:696
    ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
    __do_sys_ioctl fs/ioctl.c:720 [inline]
    __se_sys_ioctl fs/ioctl.c:718 [inline]
    __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
    do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    This is caused by a race between two concurrent md_ioctl()s closing
    the array.
    CPU1 (md_ioctl())                       CPU2 (md_ioctl())
    ------                                  ------
    set_bit(MD_CLOSING, &mddev->flags);
    did_set_md_closing = true;
                                            WARN_ON_ONCE(test_bit(MD_CLOSING,
                                                    &mddev->flags));
    if(did_set_md_closing)
        clear_bit(MD_CLOSING, &mddev->flags);

    Fix the warning by returning immediately if the MD_CLOSING bit is set
    in &mddev->flags which indicates that the array is being closed.
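
    A minimal sketch of the "return early if already closing" idea, using
    C11 atomics in place of the kernel's bit operations and locking. The
    demo_* names, the flag value and the -EBUSY return code are assumptions
    for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_MD_CLOSING 1UL

struct demo_mddev {
	atomic_ulong flags;
};

/* Mark the array as closing; fail if another caller already did so. */
static int demo_begin_close(struct demo_mddev *mddev)
{
	unsigned long old;

	assert(mddev != NULL);
	/* Atomically set the bit and learn whether it was set before. */
	old = atomic_fetch_or(&mddev->flags, DEMO_MD_CLOSING);
	if (old & DEMO_MD_CLOSING)
		return -EBUSY;
	return 0;
}

/* Clear the closing mark; only the caller that set it should do this. */
static void demo_end_close(struct demo_mddev *mddev)
{
	assert(mddev != NULL);
	atomic_fetch_and(&mddev->flags, ~DEMO_MD_CLOSING);
}
```

    The second concurrent caller gets an error instead of reaching a
    warning, and only the caller that set the bit clears it.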

    Fixes: 065e519e71b2 ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop")
    Reported-by: syzbot+1e46a0864c1a6e9bd3d8@syzkaller.appspotmail.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Dae R. Jeong
    Signed-off-by: Song Liu
    Signed-off-by: Greg Kroah-Hartman

    Dae R. Jeong
     

15 Dec, 2020

2 commits


14 Dec, 2020

1 commit

  • Pull block fixes from Jens Axboe:
    "This should be it for 5.10.

    Mike and Song looked into the warning case, and thankfully it appears
    the fix was pretty trivial - we can just change the md device chunk
    type to unsigned int to get rid of it. They cannot currently be < 0,
    and nobody is checking for that either.

    We're reverting the discard changes as the corruption reports came in
    very late, and there's just no time to attempt to deal with it at this
    point. Reverting the changes in question is the right call for 5.10"

    * tag 'block-5.10-2020-12-12' of git://git.kernel.dk/linux-block:
    md: change mddev 'chunk_sectors' from int to unsigned
    Revert "md: add md_submit_discard_bio() for submitting discard bio"
    Revert "md/raid10: extend r10bio devs to raid disks"
    Revert "md/raid10: pull codes that wait for blocked dev into one function"
    Revert "md/raid10: improve raid10 discard request"
    Revert "md/raid10: improve discard request for far layout"
    Revert "dm raid: remove unnecessary discard limits for raid10"

    Linus Torvalds
     

13 Dec, 2020

1 commit

  • Commit e2782f560c29 ("Revert "dm raid: remove unnecessary discard
    limits for raid10"") exposed compiler warnings introduced by commit
    e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10"):

    In file included from ./include/linux/kernel.h:14,
    from ./include/asm-generic/bug.h:20,
    from ./arch/x86/include/asm/bug.h:93,
    from ./include/linux/bug.h:5,
    from ./include/linux/mmdebug.h:5,
    from ./include/linux/gfp.h:5,
    from ./include/linux/slab.h:15,
    from drivers/md/dm-raid.c:8:
    drivers/md/dm-raid.c: In function ‘raid_io_hints’:
    ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer types lacks a cast
    (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
    ^~
    ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’
    (__typecheck(x, y) && __no_side_effects(x, y))
    ^~~~~~~~~~~
    ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’
    __builtin_choose_expr(__safe_cmp(x, y), \
    ^~~~~~~~~~
    ./include/linux/minmax.h:51:19: note: in expansion of macro ‘__careful_cmp’
    #define min(x, y) __careful_cmp(x, y, max_discard_sectors = min_not_zero(rs->md.chunk_sectors,
    ^~~~~~~~~~~~

    Fix this by changing the chunk_sectors member of 'struct mddev' from
    int to 'unsigned int' to match the type used for the 'chunk_sectors'
    member of 'struct queue_limits'. Various MD code still uses 'int' but
    none of it appears to ever make use of signed int; and storing
    positive signed int in unsigned is perfectly safe.
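
    The type requirement can be illustrated with a small GNU C sketch. The
    macro below is a simplified stand-in for the kernel's min_not_zero(),
    and the demo_* structs are hypothetical reductions of 'struct mddev' and
    'struct queue_limits' to the one member each that matters here.

```c
#include <assert.h>
#include <stddef.h>

/* GNU C sketch of a type-checked min_not_zero(); not the kernel macro. */
#define demo_min_not_zero(x, y) ({                                  \
	__typeof__(x) _x = (x);                                     \
	__typeof__(y) _y = (y);                                     \
	(void)(&_x == &_y); /* compiler warns if the types differ */ \
	_x == 0 ? _y : (_y == 0 ? _x : (_x < _y ? _x : _y)); })

struct demo_mddev {
	unsigned int chunk_sectors; /* unsigned, matching the limits field */
};

struct demo_queue_limits {
	unsigned int max_discard_sectors;
};

/* Clamp the discard limit to the chunk size, ignoring zero values. */
static void demo_set_discard_limit(struct demo_mddev *md,
				   struct demo_queue_limits *limits)
{
	assert(md != NULL && limits != NULL);
	limits->max_discard_sectors =
		demo_min_not_zero(md->chunk_sectors,
				  limits->max_discard_sectors);
}
```

    If chunk_sectors were declared as int, the pointer comparison inside
    the macro would mix 'int *' and 'unsigned int *', which is the kind of
    mismatch the warning above reports.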

    Reported-by: Song Liu
    Fixes: e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"")
    Fixes: e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10")
    Cc: stable@vger.kernel.org # e0910c8e4f87 was marked for stable@
    Signed-off-by: Mike Snitzer
    Reviewed-by: Song Liu
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

10 Dec, 2020

6 commits

  • This reverts commit 2628089b74d5a64bd0bcb5d247a18f78d7b6f4d0.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Signed-off-by: Song Liu

    Song Liu
     
  • This reverts commit 8650a889017cb1f6ea6813ccf83a2e9f6fa49dd3.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Signed-off-by: Song Liu

    Song Liu
     
  • This reverts commit f046f5d0d79cdb968f219ce249e497fd1accf484.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Signed-off-by: Song Liu

    Song Liu
     
  • This reverts commit bcc90d280465ebd51ab8688be86e1f00c62dccf9.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Signed-off-by: Song Liu

    Song Liu
     
  • This reverts commit d3ee2d8415a6256c1c41e1be36e80e640c3e6359.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Signed-off-by: Song Liu

    Song Liu
     
  • This reverts commit f0e90b6c663a7e3b4736cb318c6c7c589f152c28.

    Matthew Ruffell reported data corruption in raid10 due to the changes
    in discard handling [1]. Revert these changes before we find a proper fix.

    [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
    Cc: Matthew Ruffell
    Cc: Xiao Ni
    Cc: Mike Snitzer
    Acked-by: Mike Snitzer
    Signed-off-by: Song Liu

    Song Liu
     

05 Dec, 2020

3 commits

  • Fixes sparse warnings:
    drivers/md/dm.c:508:12: warning: context imbalance in 'dm_prepare_ioctl' - wrong count at exit
    drivers/md/dm.c:543:13: warning: context imbalance in 'dm_unprepare_ioctl' - wrong count at exit

    Fixes: 971888c46993f ("dm: hold DM table for duration of ioctl rather than use blkdev_get")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
  • Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error
    path to fix sparse warning:
    drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock

    Fixes: cdf6cdcd3b99a ("dm,dax: Add dax zero_page_range operation")
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
    Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account
    for target-specific splitting") caused a couple of regressions:
    1) Using lcm_not_zero() when stacking chunk_sectors was a bug because
    chunk_sectors must reflect the most limited of all devices in the
    IO stack.
    2) DM targets that set max_io_len but that do _not_ provide an
    .iterate_devices method no longer had their IO split properly.
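
    To make point 1) concrete, here is a small userspace sketch comparing
    the two ways of combining limits. The demo_* helpers are simplified
    stand-ins written for this illustration, not the kernel's
    lcm_not_zero() and min_not_zero().

```c
#include <assert.h>

static unsigned int demo_gcd(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* LCM of a and b, or whichever value is nonzero when the other is zero. */
static unsigned int demo_lcm_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0 || b == 0)
		return a ? a : b;
	return a / demo_gcd(a, b) * b;
}

/* Smaller of a and b, ignoring a zero ("no limit") value. */
static unsigned int demo_min_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}
```

    For chunk sizes of 8 and 12 sectors, the LCM gives 24, which is larger
    than either device's limit, while the minimum gives 8, the most
    restrictive of the two.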

    And commit 5091cdec56fa ("dm: change max_io_len() to use
    blk_max_size_offset()") also caused a regression where DM no longer
    supported varied (per target) IO splitting. The implication being the
    potential for severely reduced performance for IO stacks that use a DM
    target like dm-cache to hide performance limitations of a slower
    device (e.g. one that requires 4K IO splitting).

    Coming full circle: Fix all these issues by discontinuing stacking
    chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(),
    add optional chunk_sectors override argument to blk_max_size_offset()
    and update DM's max_io_len() to pass ti->max_io_len to its
    blk_max_size_offset() call.

    Passing in an optional chunk_sectors override to blk_max_size_offset()
    allows for code reuse of block's centralized calculation for max IO
    size based on provided offset and split boundary.

    Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting")
    Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()")
    Cc: stable@vger.kernel.org
    Reported-by: John Dorminy
    Reported-by: Bruce Johnston
    Reported-by: Kirill Tkhai
    Reviewed-by: John Dorminy
    Signed-off-by: Mike Snitzer
    Reviewed-by: Jens Axboe

    Mike Snitzer
     

02 Dec, 2020

4 commits

  • Building on arch/s390/ results in this build error:

    cc1: some warnings being treated as errors
    ../drivers/md/dm-writecache.c: In function 'persistent_memory_claim':
    ../drivers/md/dm-writecache.c:323:1: error: no return statement in function returning non-void [-Werror=return-type]

    Fix this by replacing the BUG() with an -EOPNOTSUPP return.
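
    The shape of such a fix can be sketched like this. DEMO_HAS_PMEM,
    struct demo_writecache and demo_persistent_memory_claim() are
    hypothetical names for illustration; the point is only that the
    fallback returns an error code on every path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_writecache {
	int unused;
};

#ifdef DEMO_HAS_PMEM
/* Placeholder for the variant built with persistent memory support. */
static int demo_persistent_memory_claim(struct demo_writecache *wc)
{
	assert(wc != NULL);
	return 0;
}
#else
/* Fallback: report "not supported" instead of trapping without a return. */
static int demo_persistent_memory_claim(struct demo_writecache *wc)
{
	assert(wc != NULL);
	return -EOPNOTSUPP;
}
#endif
```

    Because the fallback has an explicit return, it compiles cleanly even
    where a trap macro is not understood by the compiler as non-returning.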

    Fixes: 48debafe4f2f ("dm: add writecache target")
    Reported-by: Randy Dunlap
    Signed-off-by: Mike Snitzer

    Mike Snitzer
     
  • The BUG_ON(in_interrupt()) in dm_table_event() is a historic leftover from
    a rework of the dm table code which changed the calling context.

    Issuing a BUG for a wrong calling context is frowned upon, and
    in_interrupt() is deprecated and covers only some of the wrong
    contexts. The sanity check for the context is already covered by
    CONFIG_DEBUG_ATOMIC_SLEEP and other debug facilities.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Mike Snitzer

    Thomas Gleixner
     
    The dm_get_live_table() function takes an RCU read lock, so
    dm_put_live_table() must be called even if the dm_table map is not found.
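
    The pairing rule can be sketched with a userspace model in which a
    counter stands in for the read lock. All demo_* names, the counter and
    the -EIO error code are assumptions for illustration; they are not the
    device-mapper API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_table {
	int num_targets;
};

struct demo_md {
	struct demo_table *map;
	int read_lock_depth; /* models the read-side lock */
};

/* Takes the "lock" unconditionally, then returns the map (may be NULL). */
static struct demo_table *demo_get_live_table(struct demo_md *md)
{
	md->read_lock_depth++;
	return md->map;
}

static void demo_put_live_table(struct demo_md *md)
{
	assert(md->read_lock_depth > 0);
	md->read_lock_depth--;
}

static int demo_report_zones(struct demo_md *md)
{
	struct demo_table *map = demo_get_live_table(md);
	int ret;

	if (!map) {
		ret = -EIO;
		goto out; /* still release the lock taken above */
	}
	ret = map->num_targets;
out:
	demo_put_live_table(md);
	return ret;
}
```

    Routing the error path through the same exit label keeps every get
    matched by a put, including when no table is present.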

    Fixes: e76239a3748c9 ("block: add a report_zones method")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sergei Shtepa
    Signed-off-by: Mike Snitzer

    Sergei Shtepa
     
  • This reverts commit 43aeaa29573924df76f44eda2bbd94ca36e407b5.

    Since commit 0bddd227f3dc ("Documentation: update for gcc 4.9 requirement")
    the minimum supported version of GCC is gcc-4.9. It's now safe to remove
    this code.

    Link: https://github.com/ClangBuiltLinux/linux/issues/427
    Signed-off-by: Nick Desaulniers
    Acked-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer

    Nick Desaulniers
     

17 Nov, 2020

2 commits