09 Oct, 2020

3 commits

  • Callers of get_bitmap_from_slot() are responsible for freeing the bitmap.

    Suggested-by: Guoqing Jiang
    Signed-off-by: Zhao Heming
    Signed-off-by: Song Liu

    Zhao Heming
     
  • md_bitmap_get_counter() has code:

    ```
    if (bitmap->bp[page].hijacked ||
        bitmap->bp[page].map == NULL)
            csize = ((sector_t)1) << (bitmap->chunkshift +
                                      PAGE_COUNTER_SHIFT - 1);
    ```

    The "- 1" is wrong: this branch should report 2048 bits of space, but
    with the "- 1" it reports only 1024 bits.

    The bug returns the wrong blocks value, but it doesn't influence the
    bitmap logic:
    1. Most callers care about the function's return value (the counter at
    the offset), not the blocks parameter.
    2. The bug is only triggered when hijacked is true or map is NULL.
    The hijacked case is very rare, and map is NULL only while the array
    is being created or resized.
    3. Even if a caller gets the wrong blocks value, the current code just
    makes the caller call md_bitmap_get_counter() one more time.
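
The arithmetic behind the 2048 versus 1024 figures can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the names ex_csize_fixed and ex_csize_buggy are hypothetical, and the shift value of 11 is an assumption that corresponds to 4 KiB pages holding 16-bit counters.

```c
#include <stdint.h>

/* Assumed for illustration: 4096 bytes * 8 bits / 16-bit counters
 * gives 2048 counters per page, i.e. a shift of 11. */
#define EX_PAGE_COUNTER_SHIFT 11

/* Space covered by one hijacked or unmapped page, without the "- 1". */
uint64_t ex_csize_fixed(unsigned int chunkshift)
{
    return (uint64_t)1 << (chunkshift + EX_PAGE_COUNTER_SHIFT);
}

/* The same computation with the stray "- 1": half the intended span. */
uint64_t ex_csize_buggy(unsigned int chunkshift)
{
    return (uint64_t)1 << (chunkshift + EX_PAGE_COUNTER_SHIFT - 1);
}
```

With a chunkshift of 0 the two helpers differ by exactly a factor of two, which is why a caller that trusts the smaller value simply ends up calling the function once more.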

    Signed-off-by: Zhao Heming
    Signed-off-by: Song Liu

    Zhao Heming
     
  • The patched code computes the number of chunks, so it should use a
    round-up division instead of the current sector_div. The same code is
    in md_bitmap_resize():
    ```
    chunks = DIV_ROUND_UP_SECTOR_T(blocks, 1 << chunkshift);
    ```
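
The difference between the two divisions can be sketched in userspace C. The helper names below are hypothetical and uint64_t stands in for sector_t; the point is only that a truncating division drops a partial trailing chunk while a round-up division counts it.

```c
#include <stdint.h>

/* Truncating division: a partial trailing chunk is not counted. */
uint64_t ex_chunks_truncated(uint64_t blocks, unsigned int chunkshift)
{
    return blocks >> chunkshift;
}

/* Round-up division: a partial trailing chunk counts as one chunk. */
uint64_t ex_chunks_round_up(uint64_t blocks, unsigned int chunkshift)
{
    uint64_t chunk = (uint64_t)1 << chunkshift;

    return (blocks + chunk - 1) / chunk;
}
```

For example, 1025 blocks with a chunk size of 1024 need two chunks, which only the round-up form reports.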

    Signed-off-by: Zhao Heming
    Signed-off-by: Song Liu

    Zhao Heming
     

25 Sep, 2020

1 commit


24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
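
A userspace sketch of the pattern follows. The macro definition here is a stand-in written for this example, assuming a compiler that may or may not support the fallthrough statement attribute; ex_stages_run is a hypothetical function, not code from the patch.

```c
/* Stand-in for the kernel's fallthrough pseudo-keyword (assumption:
 * fall back to an empty statement when the attribute is unavailable). */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Count how many stages run when execution enters at `start`. */
int ex_stages_run(int start)
{
    int n = 0;

    switch (start) {
    case 0:
        n++;
        fallthrough;    /* previously: a fall through comment */
    case 1:
        n++;
        fallthrough;
    case 2:
        n++;
        break;
    default:
        break;
    }
    return n;
}
```

Unlike a comment, the macro is visible to the compiler, so an unintended fall-through elsewhere can still be flagged by fall-through warnings.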

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

15 Jul, 2020

1 commit

  • The following deadlock was captured. The first process holds 'kernfs_mutex'
    and is hung waiting for io. The io is staged in 'r1conf.pending_bio_list' of
    the raid1 device; this pending bio list would be flushed by the second
    process, 'md127_raid1', but that process is hung waiting for 'kernfs_mutex'.
    Using sysfs_notify_dirent_safe() instead of sysfs_notify() fixes it. Other
    sysfs_notify() calls invoked from the io path are removed as well.

    PID: 40430 TASK: ffff8ee9c8c65c40 CPU: 29 COMMAND: "probe_file"
    #0 [ffffb87c4df37260] __schedule at ffffffff9a8678ec
    #1 [ffffb87c4df372f8] schedule at ffffffff9a867f06
    #2 [ffffb87c4df37310] io_schedule at ffffffff9a0c73e6
    #3 [ffffb87c4df37328] __dta___xfs_iunpin_wait_3443 at ffffffffc03a4057 [xfs]
    #4 [ffffb87c4df373a0] xfs_iunpin_wait at ffffffffc03a6c79 [xfs]
    #5 [ffffb87c4df373b0] __dta_xfs_reclaim_inode_3357 at ffffffffc039a46c [xfs]
    #6 [ffffb87c4df37400] xfs_reclaim_inodes_ag at ffffffffc039a8b6 [xfs]
    #7 [ffffb87c4df37590] xfs_reclaim_inodes_nr at ffffffffc039bb33 [xfs]
    #8 [ffffb87c4df375b0] xfs_fs_free_cached_objects at ffffffffc03af0e9 [xfs]
    #9 [ffffb87c4df375c0] super_cache_scan at ffffffff9a287ec7
    #10 [ffffb87c4df37618] shrink_slab at ffffffff9a1efd93
    #11 [ffffb87c4df37700] shrink_node at ffffffff9a1f5968
    #12 [ffffb87c4df37788] do_try_to_free_pages at ffffffff9a1f5ea2
    #13 [ffffb87c4df377f0] try_to_free_mem_cgroup_pages at ffffffff9a1f6445
    #14 [ffffb87c4df37880] try_charge at ffffffff9a26cc5f
    #15 [ffffb87c4df37920] memcg_kmem_charge_memcg at ffffffff9a270f6a
    #16 [ffffb87c4df37958] new_slab at ffffffff9a251430
    #17 [ffffb87c4df379c0] ___slab_alloc at ffffffff9a251c85
    #18 [ffffb87c4df37a80] __slab_alloc at ffffffff9a25635d
    #19 [ffffb87c4df37ac0] kmem_cache_alloc at ffffffff9a251f89
    #20 [ffffb87c4df37b00] alloc_inode at ffffffff9a2a2b10
    #21 [ffffb87c4df37b20] iget_locked at ffffffff9a2a4854
    #22 [ffffb87c4df37b60] kernfs_get_inode at ffffffff9a311377
    #23 [ffffb87c4df37b80] kernfs_iop_lookup at ffffffff9a311e2b
    #24 [ffffb87c4df37ba8] lookup_slow at ffffffff9a290118
    #25 [ffffb87c4df37c10] walk_component at ffffffff9a291e83
    #26 [ffffb87c4df37c78] path_lookupat at ffffffff9a293619
    #27 [ffffb87c4df37cd8] filename_lookup at ffffffff9a2953af
    #28 [ffffb87c4df37de8] user_path_at_empty at ffffffff9a295566
    #29 [ffffb87c4df37e10] vfs_statx at ffffffff9a289787
    #30 [ffffb87c4df37e70] SYSC_newlstat at ffffffff9a289d5d
    #31 [ffffb87c4df37f18] sys_newlstat at ffffffff9a28a60e
    #32 [ffffb87c4df37f28] do_syscall_64 at ffffffff9a003949
    #33 [ffffb87c4df37f50] entry_SYSCALL_64_after_hwframe at ffffffff9aa001ad
    RIP: 00007f617a5f2905 RSP: 00007f607334f838 RFLAGS: 00000246
    RAX: ffffffffffffffda RBX: 00007f6064044b20 RCX: 00007f617a5f2905
    RDX: 00007f6064044b20 RSI: 00007f6064044b20 RDI: 00007f6064005890
    RBP: 00007f6064044aa0 R8: 0000000000000030 R9: 000000000000011c
    R10: 0000000000000013 R11: 0000000000000246 R12: 00007f606417e6d0
    R13: 00007f6064044aa0 R14: 00007f6064044b10 R15: 00000000ffffffff
    ORIG_RAX: 0000000000000006 CS: 0033 SS: 002b

    PID: 927 TASK: ffff8f15ac5dbd80 CPU: 42 COMMAND: "md127_raid1"
    #0 [ffffb87c4df07b28] __schedule at ffffffff9a8678ec
    #1 [ffffb87c4df07bc0] schedule at ffffffff9a867f06
    #2 [ffffb87c4df07bd8] schedule_preempt_disabled at ffffffff9a86825e
    #3 [ffffb87c4df07be8] __mutex_lock at ffffffff9a869bcc
    #4 [ffffb87c4df07ca0] __mutex_lock_slowpath at ffffffff9a86a013
    #5 [ffffb87c4df07cb0] mutex_lock at ffffffff9a86a04f
    #6 [ffffb87c4df07cc8] kernfs_find_and_get_ns at ffffffff9a311d83
    #7 [ffffb87c4df07cf0] sysfs_notify at ffffffff9a314b3a
    #8 [ffffb87c4df07d18] md_update_sb at ffffffff9a688696
    #9 [ffffb87c4df07d98] md_update_sb at ffffffff9a6886d5
    #10 [ffffb87c4df07da8] md_check_recovery at ffffffff9a68ad9c
    #11 [ffffb87c4df07dd0] raid1d at ffffffffc01f0375 [raid1]
    #12 [ffffb87c4df07ea0] md_thread at ffffffff9a680348
    #13 [ffffb87c4df07f08] kthread at ffffffff9a0b8005
    #14 [ffffb87c4df07f50] ret_from_fork at ffffffff9aa00344

    Signed-off-by: Junxiao Bi
    Signed-off-by: Song Liu

    Junxiao Bi
     

03 Jun, 2020

1 commit


09 Feb, 2020

1 commit

  • Pull misc vfs updates from Al Viro:

    - bmap series from cmaiolino

    - getting rid of convolutions in copy_mount_options() (use a couple of
    copy_from_user() instead of the __get_user() crap)

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    saner copy_mount_options()
    fibmap: Reject negative block numbers
    fibmap: Use bmap instead of ->bmap method in ioctl_fibmap
    ecryptfs: drop direct calls to ->bmap
    cachefiles: drop direct usage of ->bmap method.
    fs: Enable bmap() function to properly return errors

    Linus Torvalds
     

03 Feb, 2020

1 commit

  • Currently, bmap() returns either the physical block number for the
    requested file offset, or 0 if an error occurs or the requested offset
    maps into a hole.
    This patch makes the changes needed for bmap() to return errors
    properly: the return value becomes an error code, and a pointer must now
    be passed to bmap() to be filled in with the mapped physical block.

    It changes the behavior of bmap() on return:

    - negative value in case of error
    - zero on success or if the mapping falls into a hole

    In case of a hole, *block will be zero too.

    Since this is a prep patch, for now the only error return is -EINVAL if
    ->bmap doesn't exist.
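
The new calling convention can be sketched with a userspace mock. Everything below is hypothetical and written for illustration: struct ex_inode, its fields and ex_bmap() only model the in/out pointer and the error return described above, not the real VFS types.

```c
#include <errno.h>
#include <stdint.h>

typedef uint64_t ex_sector_t;

/* Hypothetical inode model: map[i] is the physical block of file
 * block i, with 0 meaning a hole. has_bmap models whether the
 * filesystem provides a ->bmap method. */
struct ex_inode {
    int has_bmap;
    const ex_sector_t *map;
    ex_sector_t nblocks;
};

/* On entry *block is the file block; on success it is overwritten with
 * the physical block (0 for a hole). Returns 0 on success, -EINVAL if
 * there is no bmap method. */
int ex_bmap(const struct ex_inode *inode, ex_sector_t *block)
{
    if (!inode->has_bmap)
        return -EINVAL;

    *block = (*block < inode->nblocks) ? inode->map[*block] : 0;
    return 0;
}
```

A caller therefore checks the return value first and only then interprets *block, so an error and a hole are no longer indistinguishable.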

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Carlos Maiolino
    Signed-off-by: Al Viro

    Carlos Maiolino
     

14 Jan, 2020

4 commits

  • IO serialization can degrade performance considerably. To reduce
    the degradation, an rb interval tree is added to raid1 to speed
    up the collision check.

    An rb root is therefore needed in md_rdev, so move all the
    serialize-related members into a new struct (serial_in_rdev)
    and embed it in md_rdev.

    The struct must be freed when it is no longer needed, so
    rdev/rdevs_uninit_serial are added accordingly. They should be
    called when the memory pool is destroyed or when memory can't
    be allocated.

    We also need to call mddev_destroy_serial_pool when
    serialize_policy/write-behind is disabled, when the bitmap is
    destroyed, or in __md_stop_writes.
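
To make the collision check concrete, here is a minimal sketch of the overlap test that such a check has to perform. It deliberately uses a linear scan over in-flight ranges rather than an rb interval tree, so it shows what is being checked, not the faster lookup the patch introduces; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive sector range of one write. */
struct ex_range {
    uint64_t lo;
    uint64_t hi;
};

/* Two inclusive ranges overlap when each starts before the other ends. */
int ex_overlaps(struct ex_range a, struct ex_range b)
{
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* Linear scan: returns 1 if write w collides with any in-flight write. */
int ex_collides(const struct ex_range *inflight, size_t n, struct ex_range w)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (ex_overlaps(inflight[i], w))
            return 1;
    return 0;
}
```

The scan costs time proportional to the number of in-flight writes per check, which is the overhead an interval tree is meant to reduce.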

    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     
  • The serial_info_pool is needed if the array sets serialize_policy to
    true, so don't destroy it.

    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     
  • Previously, wb_info_pool and wb_list were introduced to
    address a potential data inconsistency issue for write-behind
    devices.

    Now rename them to serial-related names, since the same
    mechanism will be used to address the overlapping-write
    reordering issue for raid1.

    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     
  • In md_bitmap_unplug, bitmap->storage.filemap is checked twice.

    In md_bitmap_daemon_work, bitmap->storage.filemap should be checked
    before it is dereferenced.

    Signed-off-by: Zhiqiang Liu
    Signed-off-by: Song Liu

    Zhiqiang Liu
     

25 Oct, 2019

1 commit

  • We need to move "spin_lock_irq(&bitmap->counts.lock)" before unmapping the
    previous storage; otherwise a panic like the one below could happen.

    [ 902.353802] sdl: detected capacity change from 1077936128 to 3221225472
    [ 902.616948] general protection fault: 0000 [#1] SMP
    [snip]
    [ 902.618588] CPU: 12 PID: 33698 Comm: md0_raid1 Tainted: G O 4.14.144-1-pserver #4.14.144-1.1~deb10
    [ 902.618870] Hardware name: Supermicro SBA-7142G-T4/BHQGE, BIOS 3.00 10/24/2012
    [ 902.619120] task: ffff9ae1860fc600 task.stack: ffffb52e4c704000
    [ 902.619301] RIP: 0010:bitmap_file_clear_bit+0x90/0xd0 [md_mod]
    [ 902.619464] RSP: 0018:ffffb52e4c707d28 EFLAGS: 00010087
    [ 902.619626] RAX: ffe8008b0d061000 RBX: ffff9ad078c87300 RCX: 0000000000000000
    [ 902.619792] RDX: ffff9ad986341868 RSI: 0000000000000803 RDI: ffff9ad078c87300
    [ 902.619986] RBP: ffff9ad0ed7a8000 R08: 0000000000000000 R09: 0000000000000000
    [ 902.620154] R10: ffffb52e4c707ec0 R11: ffff9ad987d1ed44 R12: ffff9ad0ed7a8360
    [ 902.620320] R13: 0000000000000003 R14: 0000000000060000 R15: 0000000000000800
    [ 902.620487] FS: 0000000000000000(0000) GS:ffff9ad987d00000(0000) knlGS:0000000000000000
    [ 902.620738] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 902.620901] CR2: 000055ff12aecec0 CR3: 0000001005207000 CR4: 00000000000406e0
    [ 902.621068] Call Trace:
    [ 902.621256] bitmap_daemon_work+0x2dd/0x360 [md_mod]
    [ 902.621429] ? find_pers+0x70/0x70 [md_mod]
    [ 902.621597] md_check_recovery+0x51/0x540 [md_mod]
    [ 902.621762] raid1d+0x5c/0xeb0 [raid1]
    [ 902.621939] ? try_to_del_timer_sync+0x4d/0x80
    [ 902.622102] ? del_timer_sync+0x35/0x40
    [ 902.622265] ? schedule_timeout+0x177/0x360
    [ 902.622453] ? call_timer_fn+0x130/0x130
    [ 902.622623] ? find_pers+0x70/0x70 [md_mod]
    [ 902.622794] ? md_thread+0x94/0x150 [md_mod]
    [ 902.622959] md_thread+0x94/0x150 [md_mod]
    [ 902.623121] ? wait_woken+0x80/0x80
    [ 902.623280] kthread+0x119/0x130
    [ 902.623437] ? kthread_create_on_node+0x60/0x60
    [ 902.623600] ret_from_fork+0x22/0x40
    [ 902.624225] RIP: bitmap_file_clear_bit+0x90/0xd0 [md_mod] RSP: ffffb52e4c707d28

    This happened because mdadm was running on another cpu to do a resize, so
    bitmap_resize was called to replace the bitmap, as shown below.

    PID: 38801 TASK: ffff9ad074a90e00 CPU: 0 COMMAND: "mdadm"
    [exception RIP: queued_spin_lock_slowpath+56]
    [snip]
    -- --
    #5 [ffffb52e60f17c58] queued_spin_lock_slowpath at ffffffff9c0b27b8
    #6 [ffffb52e60f17c58] bitmap_resize at ffffffffc0399877 [md_mod]
    #7 [ffffb52e60f17d30] raid1_resize at ffffffffc0285bf9 [raid1]
    #8 [ffffb52e60f17d50] update_size at ffffffffc038a31a [md_mod]
    #9 [ffffb52e60f17d70] md_ioctl at ffffffffc0395ca4 [md_mod]

    The procedure to keep a bitmap resize safe is: allocate new storage
    space, then quiesce, copy bits, replace the bitmap, and restart.

    However, the daemon (bitmap_daemon_work) can still run even when the
    array is quiesced. So when bitmap_file_clear_bit is triggered by raid1d,
    it assumes it is fine to access store->filemap since counts->lock is
    held, but resize could change the storage without the protection of
    the lock.

    Cc: Jack Wang
    Cc: NeilBrown
    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     

21 Jun, 2019

2 commits

  • The write-behind attribute is part of the bitmap, and the bitmap
    can be added/removed dynamically with the following:

    1. mdadm --grow /dev/md0 --bitmap=none
    2. mdadm --grow /dev/md0 --bitmap=internal --write-behind

    So we need to destroy wb_info_pool in md_bitmap_destroy,
    and create the pool before loading the bitmap.

    Reviewed-by: NeilBrown
    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     
  • Since write-behind mode can be enabled by writing to the backlog
    node, create wb_info_pool when the mode is newly enabled, and also
    call md_bitmap_update_sb to make the user aware that write-behind
    mode is enabled. Conversely, wb_info_pool should be destroyed
    when write-behind mode is disabled.

    Besides the above, it is better to update the bitmap sb if we
    change the value of max_write_behind.

    Reviewed-by: NeilBrown
    Signed-off-by: Guoqing Jiang
    Signed-off-by: Song Liu

    Guoqing Jiang
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

11 Apr, 2019

1 commit


11 Oct, 2018

1 commit

  • After 9e1cc0a54556 ("md: use mddev_suspend/resume instead of ->quiesce()"),
    similar quiesce() calls are still left in the bitmap functions.

    Replace quiesce() with mddev_suspend/resume.

    Also move md_bitmap_create out of mddev_suspend, and move mddev_resume
    after md_bitmap_destroy, as we did in set_bitmap_file.

    Signed-off-by: Jack Wang
    Reviewed-by: Gioh Kim
    Signed-off-by: Shaohua Li

    Jack Wang
     

19 Aug, 2018

1 commit

  • Pull input updates from Dmitry Torokhov:

    - a new driver for Rohm BU21029 touch controller

    - new bitmap APIs: bitmap_alloc, bitmap_zalloc and bitmap_free

    - updates to Atmel, eeti, pxrc and iforce drivers

    - assorted driver cleanups and fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (57 commits)
    MAINTAINERS: Add PhoenixRC Flight Controller Adapter
    Input: do not use WARN() in input_alloc_absinfo()
    Input: mark expected switch fall-throughs
    Input: raydium_i2c_ts - use true and false for boolean values
    Input: evdev - switch to bitmap API
    Input: gpio-keys - switch to bitmap_zalloc()
    Input: elan_i2c_smbus - cast sizeof to int for comparison
    bitmap: Add bitmap_alloc(), bitmap_zalloc() and bitmap_free()
    md: Avoid namespace collision with bitmap API
    dm: Avoid namespace collision with bitmap API
    Input: pm8941-pwrkey - add resin entry
    Input: pm8941-pwrkey - abstract register offsets and event code
    Input: iforce - reorganize joystick configuration lists
    Input: atmel_mxt_ts - move completion to after config crc is updated
    Input: atmel_mxt_ts - don't report zero pressure from T9
    Input: atmel_mxt_ts - zero terminate config firmware file
    Input: atmel_mxt_ts - refactor config update code to add context struct
    Input: atmel_mxt_ts - config CRC may start at T71
    Input: atmel_mxt_ts - remove unnecessary debug on ENOMEM
    Input: atmel_mxt_ts - remove duplicate setup of ABS_MT_PRESSURE
    ...

    Linus Torvalds
     

02 Aug, 2018

1 commit

  • The bitmap API (include/linux/bitmap.h) uses the 'bitmap' prefix for its
    methods.

    The MD bitmap API, on the other hand, is a special case.
    Add an 'md' prefix to it to avoid a namespace collision.

    No functional changes intended.

    Signed-off-by: Andy Shevchenko
    Acked-by: Shaohua Li
    Signed-off-by: Dmitry Torokhov

    Andy Shevchenko
     

13 Jun, 2018

2 commits

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:
    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.
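
The motivation for the 2-factor form can be sketched in userspace C: an open-coded product can wrap around silently, while an allocator that receives both factors can reject the request. The helpers below are hypothetical analogues built on calloc(), not the kernel allocators themselves.

```c
#include <stdint.h>
#include <stdlib.h>

/* Open-coded size computation: wraps silently on overflow. */
size_t ex_unchecked_bytes(size_t n, size_t size)
{
    return n * size;
}

/* 2-factor allocation: returns zeroed memory for n elements of `size`
 * bytes, or NULL when n * size would not fit in size_t. */
void *ex_calloc_checked(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;    /* the product would wrap around */
    return calloc(n, size);
}
```

Passing the count and the element size separately keeps the overflow check inside the allocator instead of relying on every call site to get it right.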

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().
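
For the 3-factor case, the idea behind array3_size() can be sketched as a saturating product: on overflow it yields SIZE_MAX, a size no allocator can satisfy, so the allocation fails instead of being undersized. ex_array3_size below is a hypothetical userspace analogue written under that assumption, not the kernel helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating a * b * c: returns SIZE_MAX if any step would overflow. */
size_t ex_array3_size(size_t a, size_t b, size_t c)
{
    size_t ab;

    if (b != 0 && a > SIZE_MAX / b)
        return SIZE_MAX;
    ab = a * b;
    if (c != 0 && ab > SIZE_MAX / c)
        return SIZE_MAX;
    return ab * c;
}
```

Because the result is a single size value, it can be passed to a 1-argument allocator while still carrying the overflow information.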

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

15 Nov, 2017

1 commit

  • Pull MD update from Shaohua Li:
    "This update mostly includes bug fixes:

    - md-cluster now supports raid10 from Guoqing

    - raid5 PPL fixes from Artur

    - badblock regression fix from Bo

    - suspend hang related fixes from Neil

    - raid5 reshape fixes from Neil

    - raid1 freeze deadlock fix from Nate

    - memleak fixes from Zdenek

    - bitmap-related fixes from me and Tao

    - other fixes and cleanups"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (33 commits)
    md: free unused memory after bitmap resize
    md: release allocated bitset sync_set
    md/bitmap: clear BITMAP_WRITE_ERROR bit before writing it to sb
    md: be cautious about using ->curr_resync_completed for ->recovery_offset
    badblocks: fix wrong return value in badblocks_set if badblocks are disabled
    md: don't check MD_SB_CHANGE_CLEAN in md_allow_write
    md-cluster: update document for raid10
    md: remove redundant variable q
    raid1: remove obsolete code in raid1_write_request
    md-cluster: Use a small window for raid10 resync
    md-cluster: Suspend writes in RAID10 if within range
    md-cluster/raid10: set "do_balance = 0" if area is resyncing
    md: use lockdep_assert_held
    raid1: prevent freeze_array/wait_all_barriers deadlock
    md: use TASK_IDLE instead of blocking signals
    md: remove special meaning of ->quiesce(.., 2)
    md: allow metadata update while suspending.
    md: use mddev_suspend/resume instead of ->quiesce()
    md: move suspend_hi/lo handling into core md code
    md: don't call bitmap_create() while array is quiesced.
    ...

    Linus Torvalds
     

11 Nov, 2017

1 commit

  • When a bitmap is resized, the old allocated chunks are not released
    once the resized bitmap starts to use the new space.

    In particular, this fixes kmemleak reports like this one:

    unreferenced object 0xffff8f4311e9c000 (size 4096):
    comm "lvm", pid 19333, jiffies 4295263268 (age 528.265s)
    hex dump (first 32 bytes):
    02 80 02 80 02 80 02 80 02 80 02 80 02 80 02 80 ................
    02 80 02 80 02 80 02 80 02 80 02 80 02 80 02 80 ................
    backtrace:
    [] kmemleak_alloc+0x4a/0xa0
    [] kmem_cache_alloc_trace+0x14e/0x2e0
    [] bitmap_checkpage+0x7c/0x110
    [] bitmap_get_counter+0x45/0xd0
    [] bitmap_set_memory_bits+0x43/0xe0
    [] bitmap_init_from_disk+0x23c/0x530
    [] bitmap_load+0xbe/0x160
    [] raid_preresume+0x203/0x2f0 [dm_raid]
    [] dm_table_resume_targets+0x4f/0xe0
    [] dm_resume+0x122/0x140
    [] dev_suspend+0x18f/0x290
    [] ctl_ioctl+0x287/0x560
    [] dm_ctl_ioctl+0x13/0x20
    [] do_vfs_ioctl+0xa6/0x750
    [] SyS_ioctl+0x79/0x90
    [] entry_SYSCALL_64_fastpath+0x1f/0xc2
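
    The shape of the fix can be sketched in userspace C. This is a
    simplified sketch, not the kernel code: struct counts, counts_free()
    and counts_resize() are hypothetical names, malloc()/calloc() stand in
    for the kernel allocators, and the copying of bits from the old table
    to the new one is left out.

    ```c
    #include <assert.h>
    #include <stdlib.h>

    /* Hypothetical, simplified counter store; not the kernel's structure. */
    struct counts {
        void **pages;   /* lazily allocated pages, NULL when unused */
        size_t npages;
    };

    /* Free every allocated page and the page table itself. */
    static void counts_free(struct counts *c)
    {
        assert(c != NULL);
        for (size_t i = 0; i < c->npages; i++)
            free(c->pages[i]);
        free(c->pages);
        c->pages = NULL;
        c->npages = 0;
    }

    /* Switch to a fresh table and release the old one.
     * Returns 0 on success, -1 if the new table cannot be allocated
     * (the old table is left intact in that case). */
    static int counts_resize(struct counts *c, size_t new_npages)
    {
        assert(c != NULL);
        void **new_pages = calloc(new_npages, sizeof(*new_pages));
        if (!new_pages)
            return -1;

        struct counts old = *c;
        c->pages = new_pages;
        c->npages = new_npages;
        counts_free(&old);  /* without this step the old pages leak */
        return 0;
    }
    ```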

    Signed-off-by: Zdenek Kabelac
    Signed-off-by: Shaohua Li

    Zdenek Kabelac
     

09 Nov, 2017

1 commit

  • For a RAID1 device using a file-based bitmap, if a bitmap write error
    occurs but later writes succeed, both the BITMAP_STALE and
    BITMAP_WRITE_ERROR bits may be written to the bitmap superblock.
    The BITMAP_STALE bit is handled properly and cleared, but the
    BITMAP_WRITE_ERROR bit in sb->flags makes bitmap_create() fail.

    So clear it to protect against the write failure-and-then-recovery case.
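
    A minimal sketch of the masking step, assuming the flags are bit
    numbers in a 32-bit word. The EX_ bit positions and the helper
    sb_state_for_disk() are hypothetical, chosen for illustration, and the
    endianness conversion of the on-disk field is omitted.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Bit positions chosen for illustration only. */
    enum {
        EX_BITMAP_STALE = 1,
        EX_BITMAP_WRITE_ERROR = 2,
    };

    /* Hypothetical helper: derive the state word written to the on-disk
     * superblock from the in-memory flags, masking the write-error bit. */
    static uint32_t sb_state_for_disk(uint32_t flags)
    {
        return flags & ~(UINT32_C(1) << EX_BITMAP_WRITE_ERROR);
    }

    /* Usage: the write-error bit is dropped, other bits are preserved. */
    void sb_state_example(void)
    {
        uint32_t flags = (UINT32_C(1) << EX_BITMAP_STALE) |
                         (UINT32_C(1) << EX_BITMAP_WRITE_ERROR);
        assert(sb_state_for_disk(flags) == (UINT32_C(1) << EX_BITMAP_STALE));
    }
    ```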

    Signed-off-by: Hou Tao
    Signed-off-by: Shaohua Li

    Hou Tao
     

02 Nov, 2017

1 commit

  • Having both a bitmap and a journal is pointless.
    Attempting to do so can corrupt the bitmap if the journal
    replay happens before the bitmap is initialized.
    Rather than try to avoid this corruption, simply
    refuse to allow arrays with both a bitmap and a journal.
    So:
    - if raid5_run sees both are present, fail.
    - if adding a bitmap finds a journal is present, fail.
    - if adding a journal finds a bitmap is present, fail.
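
    The three checks above can be sketched as follows. This is a
    simplified illustration, not the kernel code: struct array_state and
    the array_*() helpers are hypothetical, and -EINVAL is an assumed
    error code.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>

    /* Hypothetical, simplified array state; not the kernel's struct mddev. */
    struct array_state {
        bool has_bitmap;
        bool has_journal;
    };

    /* Starting the array fails if both a bitmap and a journal exist. */
    static int array_run(const struct array_state *a)
    {
        assert(a != NULL);
        return (a->has_bitmap && a->has_journal) ? -EINVAL : 0;
    }

    /* Adding a bitmap fails if a journal is already present. */
    static int array_add_bitmap(struct array_state *a)
    {
        assert(a != NULL);
        if (a->has_journal)
            return -EINVAL;  /* assumed error code */
        a->has_bitmap = true;
        return 0;
    }

    /* Adding a journal fails if a bitmap is already present. */
    static int array_add_journal(struct array_state *a)
    {
        assert(a != NULL);
        if (a->has_bitmap)
            return -EINVAL;  /* assumed error code */
        a->has_journal = true;
        return 0;
    }
    ```

    Rejecting the combination at all three entry points means the state
    "bitmap and journal both present" can never be reached at run time.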

    Cc: stable@vger.kernel.org (4.10+)
    Signed-off-by: NeilBrown
    Tested-by: Joshua Kinard
    Acked-by: Joshua Kinard
    Signed-off-by: Shaohua Li

    NeilBrown
     

17 Oct, 2017

1 commit

  • Motivated by the desire to eliminate the imprecise nature of
    DM-specific patches being unnecessarily sent to both the MD maintainer
    and mailing list, which is born out of the fact that DM files also
    reside in drivers/md/.

    Now all MD-specific files in drivers/md/ start with either "raid" or
    "md-" and the MAINTAINERS file has been updated accordingly.

    Shaohua: don't change module name

    Signed-off-by: Mike Snitzer
    Signed-off-by: Shaohua Li

    Mike Snitzer