15 Jul, 2018

1 commit


11 Jul, 2018

1 commit

  • Fix a regression introduced in Linux kernel 4.17 where sending a SCSI
    command that does not transfer data (such as TEST UNIT READY) via
    /dev/bsg/* results in EINVAL.

    Fixes: 17cb960f29c2 ("bsg: split handling of SCSI CDBs vs transport requeues")
    Cc: # 4.17+
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Tony Battersby
    Signed-off-by: Jens Axboe

    Tony Battersby
     

01 Jul, 2018

1 commit

  • Pull block fixes from Jens Axboe:
    "Small set of fixes for this series. Mostly just minor fixes, the only
    oddball in here is the sg change.

    The sg change came out of the stall fix for NVMe, where we added a
    mempool and limited us to a single page allocation. CONFIG_SG_DEBUG
    sort-of ruins that, since we'd need to account for that. That's
    actually a generic problem, since lots of drivers need to allocate SG
    lists. So this just removes support for CONFIG_SG_DEBUG, which I added
    back in 2007 and to my knowledge it was never useful.

    Anyway, outside of that, this pull contains:

    - clone of request with special payload fix (Bart)

    - drbd discard handling fix (Bart)

    - SATA blk-mq stall fix (me)

    - chunk size fix (Keith)

    - double free nvme rdma fix (Sagi)"

    * tag 'for-linus-20180629' of git://git.kernel.dk/linux-block:
    sg: remove ->sg_magic member
    drbd: Fix drbd_request_prepare() discard handling
    blk-mq: don't queue more if we get a busy return
    block: Fix cloning of requests with a special payload
    nvme-rdma: fix possible double free of controller async event buffer
    block: Fix transfer when chunk sectors exceeds max

    Linus Torvalds
     

29 Jun, 2018

1 commit

  • Some devices have different queue limits depending on the type of IO. A
    classic case is SATA NCQ, where some commands can queue, but others
    cannot. If we have NCQ commands inflight and encounter a non-queueable
    command, the driver returns busy. Currently we attempt to dispatch more
    from the scheduler, if we were able to queue some commands. But for the
    case where we ended up stopping due to BUSY, we should not attempt to
    retrieve more from the scheduler. If we do, we can get into a situation
    where we attempt to queue a non-queueable command, get BUSY, then
    successfully retrieve more commands from that scheduler and queue those.
    This can repeat forever, starving the non-queueable command indefinitely.

    Fix this by NOT attempting to pull more commands from the scheduler, if
    we get a BUSY return. This should also be more optimal in terms of
    letting requests stay in the scheduler for as long as possible, if we
    get a BUSY due to the regular out-of-tags condition.

    Reviewed-by: Omar Sandoval
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
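
The dispatch rule described in the commit above can be sketched as a toy model. This is a minimal illustration, not the blk-mq code: `toy_dev`, `toy_queue_rq`, and `toy_dispatch_list` are hypothetical names, and the device is assumed to reject a non-queueable command with BUSY while anything is in flight. The return value tells the caller whether it may pull more requests from the scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_status { TOY_OK, TOY_BUSY };

struct toy_dev {
    int inflight;   /* commands currently queued on the device */
};

/* Assumed device model: a non-queueable command is only accepted
 * when nothing else is in flight. */
enum toy_status toy_queue_rq(struct toy_dev *dev, bool queueable)
{
    if (!queueable && dev->inflight > 0)
        return TOY_BUSY;
    dev->inflight++;
    return TOY_OK;
}

/* Queue the commands in order. Returns true if the caller may fetch
 * more work from the scheduler, false if a BUSY return stopped us. */
bool toy_dispatch_list(struct toy_dev *dev, const bool *queueable,
                       size_t n, size_t *queued)
{
    assert(dev != NULL && queued != NULL);

    *queued = 0;
    for (size_t i = 0; i < n; i++) {
        if (toy_queue_rq(dev, queueable[i]) == TOY_BUSY)
            return false;   /* BUSY: leave the rest in the scheduler */
        (*queued)++;
    }
    return true;
}
```

Returning false even when some commands were queued is the point of the fix: the rejected command stays at the head of the line instead of being overtaken by newly fetched queueable commands.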
     

28 Jun, 2018

1 commit

  • This patch prevents the following kernel bug from being triggered when a
    path controlled by the dm-mpath driver is removed while mkfs is running:

    kernel BUG at block/blk-core.c:3347!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 20 PID: 24369 Comm: mkfs.ext4 Not tainted 4.18.0-rc1-dbg+ #2
    RIP: 0010:blk_end_request_all+0x68/0x70
    Call Trace:

    dm_softirq_done+0x326/0x3d0 [dm_mod]
    blk_done_softirq+0x19b/0x1e0
    __do_softirq+0x128/0x60d
    irq_exit+0x100/0x110
    smp_call_function_single_interrupt+0x90/0x330
    call_function_single_interrupt+0xf/0x20

    Fixes: f9d03f96b988 ("block: improve handling of the magic discard payload")
    Reviewed-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Acked-by: Mike Snitzer
    Signed-off-by: Bart Van Assche
    Cc: Hannes Reinecke
    Cc: Johannes Thumshirn
    Cc:
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

24 Jun, 2018

2 commits

  • Pull block fixes from Jens Axboe:

    - Further timeout fixes. We aren't quite there yet, so expect another
    round of fixes for that to completely close some of the IRQ vs
    completion races. (Christoph/Bart)

    - Set of NVMe fixes from the usual suspects, mostly error handling

    - Two off-by-one fixes (Dan)

    - Another bdi race fix (Jan)

    - Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron)

    * tag 'for-linus-20180623' of git://git.kernel.dk/linux-block:
    blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE
    bdi: Fix another oops in wb_workfn()
    lightnvm: Remove depends on HAS_DMA in case of platform dependency
    nvme-pci: limit max IO size and segments to avoid high order allocations
    nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
    nvme-fc: release io queues to allow fast fail
    nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.
    block: sed-opal: Fix a couple off by one bugs
    blk-mq-debugfs: Off by one in blk_mq_rq_state_name()
    nvmet: reset keep alive timer in controller enable
    nvme-rdma: don't override opts->queue_size
    nvme-rdma: Fix command completion race at error recovery
    nvme-rdma: fix possible free of a non-allocated async event buffer
    nvme-rdma: fix possible double free condition when failing to create a controller
    Revert "block: Add warning for bi_next not NULL in bio_endio()"
    block: fix timeout changes for legacy request drivers

    Linus Torvalds
     
  • Make sure that RQF_TIMED_OUT is cleared when a request is reused
    after a block driver timeout handler has returned BLK_EH_DONE.

    Fixes: da6612673988 ("blk-mq: don't time out requests again that are in the timeout handler")
    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Jianchao Wang
    Cc: Andrew Randrianasulu
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

21 Jun, 2018

2 commits

  • resp->num is the number of tokens in resp->tok[]. It gets set in
    response_parse(). So if n == resp->num then we're reading beyond the
    end of the data.

    Fixes: 455a7b238cd6 ("block: Add Sed-opal library")
    Reviewed-by: Scott Bauer
    Tested-by: Scott Bauer
    Signed-off-by: Dan Carpenter
    Signed-off-by: Jens Axboe

    Dan Carpenter
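
The off-by-one described in the commit above comes down to comparing an index against a count. A minimal sketch, assuming a hypothetical `toy_resp` structure and `toy_get_token` helper that only mirror the shape of the description (they are not the sed-opal code):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TOKS 8

struct toy_resp {
    int num;                  /* number of valid entries in tok[] */
    int tok[TOY_MAX_TOKS];
};

/* Returns 0 and stores the token on success, -1 if n is out of range. */
int toy_get_token(const struct toy_resp *resp, int n, int *out)
{
    assert(resp != NULL && out != NULL);

    /* Valid indices are 0 .. num - 1. Testing "n > resp->num" instead
     * would let n == num through and read one past the valid data. */
    if (n < 0 || n >= resp->num)
        return -1;

    *out = resp->tok[n];
    return 0;
}
```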
     
  • If rq_state == ARRAY_SIZE() then we read one element beyond the end of
    the blk_mq_rq_state_name_array[] array.

    Fixes: ec6dcf63c55c ("blk-mq-debugfs: Show more request state information")
    Reviewed-by: Bart Van Assche
    Signed-off-by: Dan Carpenter
    Signed-off-by: Jens Axboe

    Dan Carpenter
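
The same pattern applies to a name lookup table, as in the commit above. The sketch below is illustrative only: `TOY_ARRAY_SIZE`, `toy_state`, and `toy_state_name` are hypothetical stand-ins, not the blk-mq-debugfs symbols.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum toy_state { TOY_IDLE, TOY_IN_FLIGHT, TOY_COMPLETE, TOY_STATE_COUNT };

static const char *const toy_state_names[] = {
    "idle", "in_flight", "complete",
};

/* Keep the table and the enum in step at compile time. */
static_assert(TOY_ARRAY_SIZE(toy_state_names) == TOY_STATE_COUNT,
              "one name per state");

const char *toy_state_name(size_t state)
{
    /* The last valid index is ARRAY_SIZE - 1, so the guard must be ">=".
     * A ">" guard would accept state == ARRAY_SIZE and read past the end. */
    if (state >= TOY_ARRAY_SIZE(toy_state_names))
        return "???";
    return toy_state_names[state];
}
```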
     

20 Jun, 2018

2 commits

  • Commit 0ba99ca4838b ("block: Add warning for bi_next not NULL in
    bio_endio()") breaks the dm driver. end_clone_bio() detects whether
    or not a bio is the last bio associated with a request by checking
    the .bi_next field. Commit 0ba99ca4838b clears that field before
    end_clone_bio() has had a chance to inspect that field. Hence revert
    commit 0ba99ca4838b.

    This patch prevents KASAN from reporting the following complaint when
    running the srp-test software (srp-test/run_tests -c -d -r 10 -t 02-mq):

    ==================================================================
    BUG: KASAN: use-after-free in bio_advance+0x11b/0x1d0
    Read of size 4 at addr ffff8801300e06d0 by task ksoftirqd/0/9

    CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.18.0-rc1-dbg+ #1
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
    dump_stack+0xa4/0xf5
    print_address_description+0x6f/0x270
    kasan_report+0x241/0x360
    __asan_load4+0x78/0x80
    bio_advance+0x11b/0x1d0
    blk_update_request+0xa7/0x5b0
    scsi_end_request+0x56/0x320 [scsi_mod]
    scsi_io_completion+0x7d6/0xb20 [scsi_mod]
    scsi_finish_command+0x1c0/0x280 [scsi_mod]
    scsi_softirq_done+0x19a/0x230 [scsi_mod]
    blk_mq_complete_request+0x160/0x240
    scsi_mq_done+0x50/0x1a0 [scsi_mod]
    srp_recv_done+0x515/0x1330 [ib_srp]
    __ib_process_cq+0xa0/0xf0 [ib_core]
    ib_poll_handler+0x38/0xa0 [ib_core]
    irq_poll_softirq+0xe8/0x1f0
    __do_softirq+0x128/0x60d
    run_ksoftirqd+0x3f/0x60
    smpboot_thread_fn+0x352/0x460
    kthread+0x1c1/0x1e0
    ret_from_fork+0x24/0x30

    Allocated by task 1918:
    save_stack+0x43/0xd0
    kasan_kmalloc+0xad/0xe0
    kasan_slab_alloc+0x11/0x20
    kmem_cache_alloc+0xfe/0x350
    mempool_alloc_slab+0x15/0x20
    mempool_alloc+0xfb/0x270
    bio_alloc_bioset+0x244/0x350
    submit_bh_wbc+0x9c/0x2f0
    __block_write_full_page+0x299/0x5a0
    block_write_full_page+0x16b/0x180
    blkdev_writepage+0x18/0x20
    __writepage+0x42/0x80
    write_cache_pages+0x376/0x8a0
    generic_writepages+0xbe/0x110
    blkdev_writepages+0xe/0x10
    do_writepages+0x9b/0x180
    __filemap_fdatawrite_range+0x178/0x1c0
    file_write_and_wait_range+0x59/0xc0
    blkdev_fsync+0x46/0x80
    vfs_fsync_range+0x66/0x100
    do_fsync+0x3d/0x70
    __x64_sys_fsync+0x21/0x30
    do_syscall_64+0x77/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Freed by task 9:
    save_stack+0x43/0xd0
    __kasan_slab_free+0x137/0x190
    kasan_slab_free+0xe/0x10
    kmem_cache_free+0xd3/0x380
    mempool_free_slab+0x17/0x20
    mempool_free+0x63/0x160
    bio_free+0x81/0xa0
    bio_put+0x59/0x60
    end_bio_bh_io_sync+0x5d/0x70
    bio_endio+0x1a7/0x360
    blk_update_request+0xd0/0x5b0
    end_clone_bio+0xa3/0xd0 [dm_mod]
    bio_endio+0x1a7/0x360
    blk_update_request+0xd0/0x5b0
    scsi_end_request+0x56/0x320 [scsi_mod]
    scsi_io_completion+0x7d6/0xb20 [scsi_mod]
    scsi_finish_command+0x1c0/0x280 [scsi_mod]
    scsi_softirq_done+0x19a/0x230 [scsi_mod]
    blk_mq_complete_request+0x160/0x240
    scsi_mq_done+0x50/0x1a0 [scsi_mod]
    srp_recv_done+0x515/0x1330 [ib_srp]
    __ib_process_cq+0xa0/0xf0 [ib_core]
    ib_poll_handler+0x38/0xa0 [ib_core]
    irq_poll_softirq+0xe8/0x1f0
    __do_softirq+0x128/0x60d

    The buggy address belongs to the object at ffff8801300e0640
    which belongs to the cache bio-0 of size 200
    The buggy address is located 144 bytes inside of
    200-byte region [ffff8801300e0640, ffff8801300e0708)
    The buggy address belongs to the page:
    page:ffffea0004c03800 count:1 mapcount:0 mapping:ffff88015a563a00 index:0x0 compound_mapcount: 0
    flags: 0x8000000000008100(slab|head)
    raw: 8000000000008100 dead000000000100 dead000000000200 ffff88015a563a00
    raw: 0000000000000000 0000000000330033 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8801300e0580: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc
    ffff8801300e0600: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
    >ffff8801300e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff8801300e0700: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff8801300e0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    ==================================================================

    Cc: Kent Overstreet
    Fixes: 0ba99ca4838b ("block: Add warning for bi_next not NULL in bio_endio()")
    Acked-by: Mike Snitzer
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • blk_mq_complete_request can only be called for blk-mq drivers, but when
    removing the BLK_EH_HANDLED return value, two legacy request timeout
    methods incorrectly got switched to call blk_mq_complete_request.
    Call __blk_complete_request instead to reinstate the previous behavior.
    For that __blk_complete_request needs to be exported.

    Fixes: 1fc2b62e ("scsi_transport_fc: complete requests from ->timeout")
    Fixes: 0df0bb08 ("null_blk: complete requests from ->timeout")
    Reported-by: Jianchao Wang
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

17 Jun, 2018

1 commit

  • Pull block fixes from Jens Axboe:
    "A collection of fixes that should go into -rc1. This contains:

    - bsg_open vs bsg_unregister race fix (Anatoliy)

    - NVMe pull request from Christoph, with fixes for regressions in
    this window, FC connect/reconnect path code unification, and a
    trace point addition.

    - timeout fix (Christoph)

    - remove a few unused functions (Christoph)

    - blk-mq tag_set reinit fix (Roman)"

    * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
    bsg: fix race of bsg_open and bsg_unregister
    block: remov blk_queue_invalidate_tags
    nvme-fabrics: fix and refine state checks in __nvmf_check_ready
    nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
    nvme-fabrics: refactor queue ready check
    blk-mq: remove blk_mq_tagset_iter
    nvme: remove nvme_reinit_tagset
    nvme-fc: fix nulling of queue data on reconnect
    nvme-fc: remove reinit_request routine
    blk-mq: don't time out requests again that are in the timeout handler
    nvme-fc: change controllers first connect to use reconnect path
    nvme: don't rely on the changed namespace list log
    nvmet: free smart-log buffer after use
    nvme-rdma: fix error flow during mapping request data
    nvme: add bio remapping tracepoint
    nvme: fix NULL pointer dereference in nvme_init_subsystem
    blk-mq: reinit q->tag_set_list entry only after grace period

    Linus Torvalds
     

16 Jun, 2018

1 commit

  • As we move stuff around, some doc references are broken. Fix some of
    them via this script:
    ./scripts/documentation-file-ref-check --fix

    Manually checked if the produced result is valid, removing a few
    false-positives.

    Acked-by: Takashi Iwai
    Acked-by: Masami Hiramatsu
    Acked-by: Stephen Boyd
    Acked-by: Charles Keepax
    Acked-by: Mathieu Poirier
    Reviewed-by: Coly Li
    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

15 Jun, 2018

3 commits

  • The existing implementation allows races between bsg_unregister and
    bsg_open paths. bsg_unregister and request_queue cleanup and deletion
    may start and complete right after bsg_get_device (in bsg_open path)
    retrieves bsg_class_device and releases the mutex. Then bsg_open path
    touches freed memory of bsg_class_device and request_queue.

    One possible fix is to hold the mutex all the way through bsg_get_device
    instead of releasing it after bsg_class_device retrieval.

    Reviewed-by: Christoph Hellwig
    Signed-Off-By: Anatoliy Glagolev
    Signed-off-by: Jens Axboe

    Anatoliy Glagolev
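
The idea of holding the mutex across both the lookup and the reference grab can be sketched in user space. This is a simplified model under stated assumptions, not the bsg code: `toy_dev`, `toy_table`, `toy_get_device`, and `toy_unregister` are hypothetical, and a plain counter stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_dev {
    int refs;          /* references handed out to openers */
    bool registered;
};

pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
struct toy_dev *toy_table[TOY_SLOTS];

/* Look up the device and take a reference in one critical section, so
 * an unregister cannot run between the lookup and the reference grab. */
struct toy_dev *toy_get_device(size_t minor)
{
    struct toy_dev *dev = NULL;

    pthread_mutex_lock(&toy_lock);
    if (minor < TOY_SLOTS && toy_table[minor] &&
        toy_table[minor]->registered) {
        dev = toy_table[minor];
        dev->refs++;
    }
    pthread_mutex_unlock(&toy_lock);
    return dev;
}

/* Remove the device from the table under the same lock. */
void toy_unregister(size_t minor)
{
    assert(minor < TOY_SLOTS);

    pthread_mutex_lock(&toy_lock);
    if (toy_table[minor]) {
        toy_table[minor]->registered = false;
        toy_table[minor] = NULL;
    }
    pthread_mutex_unlock(&toy_lock);
}
```

An opener that wins the lock gets a referenced device; one that loses sees an empty slot and gets NULL. Neither can observe a device that is halfway through teardown.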
     
  • This function is entirely unused, so remove it and the tag_queue_busy
    member of struct request_queue.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Pull NVMe fixes from Christoph:

    "Fix various little regressions introduced in this merge window, plus
    a rework of the fibre channel connect and reconnect path to share the
    code instead of having separate sets of bugs. Last but not least a
    trivial trace point addition from Hannes."

    * 'nvme-4.18' of git://git.infradead.org/nvme:
    nvme-fabrics: fix and refine state checks in __nvmf_check_ready
    nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
    nvme-fabrics: refactor queue ready check
    blk-mq: remove blk_mq_tagset_iter
    nvme: remove nvme_reinit_tagset
    nvme-fc: fix nulling of queue data on reconnect
    nvme-fc: remove reinit_request routine
    nvme-fc: change controllers first connect to use reconnect path
    nvme: don't rely on the changed namespace list log
    nvmet: free smart-log buffer after use
    nvme-rdma: fix error flow during mapping request data
    nvme: add bio remapping tracepoint
    nvme: fix NULL pointer dereference in nvme_init_subsystem

    Jens Axboe
     

14 Jun, 2018

2 commits


13 Jun, 2018

5 commits

  • The vzalloc() function has no 2-factor argument form, so multiplication
    factors need to be wrapped in array_size(). This patch replaces cases of:

    vzalloc(a * b)

    with:

    vzalloc(array_size(a, b))

    as well as handling cases of:

    vzalloc(a * b * c)

    with:

    vzalloc(array3_size(a, b, c))

    This does, however, attempt to ignore constant size factors like:

    vzalloc(4 * 1024)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    vzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    vzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    vzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    vzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_ID
    + array_size(COUNT_ID, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_ID)
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_ID
    + array_size(COUNT_ID, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT_CONST)
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT_CONST
    + array_size(COUNT_CONST, sizeof(THING))
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    vzalloc(
    - SIZE * COUNT
    + array_size(COUNT, SIZE)
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    vzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    vzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    vzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    vzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    vzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    vzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    vzalloc(C1 * C2 * C3, ...)
    |
    vzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants.
    @@
    expression E1, E2;
    constant C1, C2;
    @@

    (
    vzalloc(C1 * C2, ...)
    |
    vzalloc(
    - E1 * E2
    + array_size(E1, E2)
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
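
The reason for wrapping the factors in array_size() in the commit above is that an open-coded product can silently wrap around. A user-space sketch of the saturating behavior, assuming the GCC/Clang builtin `__builtin_mul_overflow`; `toy_array_size` and `toy_array3_size` are hypothetical stand-ins for the kernel helpers, not their actual definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Product of two factors, saturating at SIZE_MAX on overflow. */
size_t toy_array_size(size_t a, size_t b)
{
    size_t bytes;

    if (__builtin_mul_overflow(a, b, &bytes))
        return SIZE_MAX;
    return bytes;
}

/* Product of three factors, saturating at SIZE_MAX on overflow. */
size_t toy_array3_size(size_t a, size_t b, size_t c)
{
    size_t bytes;

    if (__builtin_mul_overflow(a, b, &bytes) ||
        __builtin_mul_overflow(bytes, c, &bytes))
        return SIZE_MAX;
    return bytes;
}

/* Usage sketch: small products pass through, overflowing ones saturate. */
void toy_array_size_demo(void)
{
    assert(toy_array_size(4, 1024) == 4096);
    assert(toy_array_size(SIZE_MAX, 2) == SIZE_MAX);
}
```

Saturating to SIZE_MAX means a single-argument allocator receives a size that cannot be satisfied, so the allocation fails instead of succeeding with a buffer that is too small.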
     
  • The kvmalloc() function has a 2-factor argument form, kvmalloc_array(). This
    patch replaces cases of:

    kvmalloc(a * b, gfp)

    with:

    kvmalloc_array(a, b, gfp)

    as well as handling cases of:

    kvmalloc(a * b * c, gfp)

    with:

    kvmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kvmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kvmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kvmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kvmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kvmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kvmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kvmalloc
    + kvmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kvmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kvmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kvmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kvmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kvmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kvmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kvmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kvmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kvmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kvmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kvmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kvmalloc(C1 * C2 * C3, ...)
    |
    kvmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kvmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kvmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kvmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kvmalloc(sizeof(THING) * C2, ...)
    |
    kvmalloc(sizeof(TYPE) * C2, ...)
    |
    kvmalloc(C1 * C2 * C3, ...)
    |
    kvmalloc(C1 * C2, ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kvmalloc
    + kvmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
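
A 2-factor allocator can check the product itself, which is what makes the conversion in the commit above worthwhile. A user-space sketch, assuming the GCC/Clang builtin `__builtin_mul_overflow`; `toy_alloc_array` is a hypothetical stand-in built on malloc(), not the kernel's kvmalloc_array():

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate n elements of the given size, or return NULL if the total
 * byte count does not fit in size_t. */
void *toy_alloc_array(size_t n, size_t size)
{
    size_t bytes;

    if (__builtin_mul_overflow(n, size, &bytes))
        return NULL;
    return malloc(bytes);
}

/* Usage sketch: the open-coded product wraps to 0, while the
 * 2-factor form detects the overflow and refuses. */
void toy_alloc_array_demo(void)
{
    size_t n = SIZE_MAX / 2 + 1;

    assert(n * 2 == 0);                       /* unsigned wrap-around */
    assert(toy_alloc_array(n, 2) == NULL);    /* overflow rejected */
}
```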
     
  • The kzalloc_node() function has a 2-factor argument form, kcalloc_node(). This
    patch replaces cases of:

    kzalloc_node(a * b, gfp, node)

    with:

    kcalloc_node(a, b, gfp, node)

    as well as handling cases of:

    kzalloc_node(a * b * c, gfp, node)

    with:

    kzalloc_node(array3_size(a, b, c), gfp, node)

    as it's slightly less ugly than:

    kcalloc_node(array_size(a, b), c, gfp, node)

    This does, however, attempt to ignore constant size factors like:

    kzalloc_node(4 * 1024, gfp, node)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc_node(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc_node(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc_node(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc_node(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc_node
    + kcalloc_node
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc_node(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc_node(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc_node(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc_node(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc_node(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc_node(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc_node(C1 * C2 * C3, ...)
    |
    kzalloc_node(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc_node(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc_node(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc_node(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc_node(sizeof(THING) * C2, ...)
    |
    kzalloc_node(sizeof(TYPE) * C2, ...)
    |
    kzalloc_node(C1 * C2 * C3, ...)
    |
    kzalloc_node(C1 * C2, ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc_node
    + kcalloc_node
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:

    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     
  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:

    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

11 Jun, 2018

2 commits

  • It is not allowed to reinit the q->tag_set_list list entry while the RCU
    grace period has not yet completed; otherwise the following soft lockup in
    blk_mq_sched_restart() happens:

    [ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270]
    [ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000
    [ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150
    [ 1064.256510] Call Trace:
    [ 1064.256664]
    [ 1064.256824] blk_mq_free_request+0xea/0x100
    [ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client]
    [ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client]
    [ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core]
    [ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client]
    [ 1064.257669] ib_create_qp+0x321/0x380 [ib_core]
    [ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core]
    [ 1064.258007] irq_poll_softirq+0xb7/0xe0
    [ 1064.258165] __do_softirq+0x106/0x2a2
    [ 1064.258328] irq_exit+0x92/0xa0
    [ 1064.258509] do_IRQ+0x4a/0xd0
    [ 1064.258660] common_interrupt+0x7a/0x7a
    [ 1064.258818]

    Meanwhile another context frees another queue, but one with the same set
    of shared tags:

    [ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds.
    [ 1288.201833] bash D 0 5910 5820 0x00000000
    [ 1288.202016] Call Trace:
    [ 1288.202315] schedule+0x32/0x80
    [ 1288.202462] schedule_timeout+0x1e5/0x380
    [ 1288.203838] wait_for_completion+0xb0/0x120
    [ 1288.204137] __wait_rcu_gp+0x125/0x160
    [ 1288.204287] synchronize_sched+0x6e/0x80
    [ 1288.204770] blk_mq_free_queue+0x74/0xe0
    [ 1288.204922] blk_cleanup_queue+0xc7/0x110
    [ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client]
    [ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client]
    [ 1288.205548] kernfs_fop_write+0x109/0x180
    [ 1288.206328] vfs_write+0xb3/0x1a0
    [ 1288.206476] SyS_write+0x52/0xc0
    [ 1288.206624] do_syscall_64+0x68/0x1d0
    [ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

    What happened is the following:

    1. There are several MQ queues with shared tags.
    2. One queue is about to be freed, and its task is now in
    blk_mq_del_queue_tag_set().
    3. Another CPU is in blk_mq_sched_restart() and loops over all queues in
    the tag list in order to find an hctx to restart.

    Because the linked list entry was modified in blk_mq_del_queue_tag_set()
    without waiting for a grace period, blk_mq_sched_restart() never
    terminates: it spins in list_for_each_entry_rcu_rr(), hence the soft lockup.

    The fix is simple: reinit the list entry only after an RCU grace period
    has elapsed.
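
    The failure mode can be reproduced in a single-threaded user-space model.
    The sketch below is an illustration only: struct node and the helpers are
    hypothetical stand-ins for the kernel's list_head API, and the "reader" is
    simply a walk that starts on the entry the writer has just removed. An
    entry that is unlinked but keeps its next pointer still leads back to the
    list head, whereas an entry that is reinitialised too early points to
    itself and the walk never ends (the step limit exists only so the model
    terminates).

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct node {
    struct node *next, *prev;
};

/* Make n an empty list: both pointers refer back to n itself. */
static void list_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink n so its neighbours skip it, but leave n->next pointing into the list. */
static void unlink_keep_next(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/*
 * Model a reader that was already standing on pos when the writer ran.
 * Returns true if following next pointers reaches head within limit steps.
 */
static bool reader_reaches_head(const struct node *pos,
                                const struct node *head, int limit)
{
    for (int i = 0; i < limit; i++) {
        pos = pos->next;
        if (pos == head)
            return true;
    }
    return false;
}
```

    Building head -> a -> b, calling unlink_keep_next(&a) and then walking
    from a reaches head via b. Calling list_init(&a) before the reader has
    moved on makes the same walk fail, which is the spin described above.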

    Fixes: 705cda97ee3a ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
    Cc: stable@vger.kernel.org
    Cc: Sagi Grimberg
    Cc: linux-block@vger.kernel.org
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Signed-off-by: Roman Pen
    Signed-off-by: Jens Axboe

    Roman Pen
     
  • Pull block flush handling fix from Jens Axboe:
    "Single fix that we should merge now, fixing a regression in queuing
    flush request, accessing request flags after calling the end_request
    handler"

    * tag 'for-linus-20180610' of git://git.kernel.dk/linux-block:
    block: fix use-after-free in block flush handling

    Linus Torvalds
     

09 Jun, 2018

3 commits

  • A recent commit reused the original request flags for the flush
    queue handling. However, for some of the kick flush cases, the
    original request was already completed. This caused a use after
    free, if blk-mq wasn't used.

    Fixes: 84fca1b0c461 ("block: pass failfast and driver-specific flags to flush requests")
    Reported-by: Dmitry Vyukov
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Pull block fixes from Jens Axboe:
    "A few fixes for this merge window, where some of them should go in
    sooner rather than later, hence a new pull this week. This pull
    request contains:

    - Set of NVMe fixes, mostly follow up cleanups/fixes to the queue
    changes, but also teardown/removal and misc changes (Christoph/Dan/
    Johannes/Sagi/Steve).

    - Two lightnvm fixes for issues that showed up in this window
    (Colin/Wei).

    - Failfast/driver flags inheritance for flush requests (Hannes).

    - The md device put sanitization and fix (Kent).

    - dm bio_set inheritance fix (me).

    - nbd discard granularity fix (Josef).

    - nbd consistency in command printing (Kevin).

    - Loop recursion validation fix (Ted).

    - Partition overlap check (Wang)"

    [ .. and now my build is warning-free again thanks to the md fix - Linus ]

    * tag 'for-linus-20180608' of git://git.kernel.dk/linux-block: (22 commits)
    nvme: cleanup double shift issue
    nvme-pci: make CMB SQ mod-param read-only
    nvme-pci: unquiesce dead controller queues
    nvme-pci: remove HMB teardown on reset
    nvme-pci: queue creation fixes
    nvme-pci: remove unnecessary completion doorbell check
    nvme-pci: remove unnecessary nested locking
    nvmet: filter newlines from user input
    nvme-rdma: correctly check for target keyed sgl support
    nvme: don't hold nvmf_transports_rwsem for more than transport lookups
    nvmet: return all zeroed buffer when we can't find an active namespace
    md: Unify mddev destruction paths
    dm: use bioset_init_from_src() to copy bio_set
    block: add bioset_init_from_src() helper
    block: always set partition number to '0' in blk_partition_remap()
    block: pass failfast and driver-specific flags to flush requests
    nbd: set discard_alignment to the granularity
    nbd: Consistently use request pointer in debug messages.
    block: add verifier for cmdline partition
    lightnvm: pblk: fix resource leak of invalid_bitmap
    ...

    Linus Torvalds
     
  • Pull aio iopriority support from Al Viro:
    "The rest of aio stuff for this cycle - Adam's aio ioprio series"

    * 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: aio ioprio use ioprio_check_cap ret val
    fs: aio ioprio add explicit block layer dependence
    fs: iomap dio set bio prio from kiocb prio
    fs: blkdev set bio prio from kiocb prio
    fs: Add aio iopriority support
    fs: Convert kiocb rw_hint from enum to u16
    block: add ioprio_check_cap function

    Linus Torvalds
     

08 Jun, 2018

1 commit


07 Jun, 2018

1 commit

  • blk_partition_remap() will only clear bi_partno if an actual remapping
    has happened. But flush requests et al. don't have an actual size, so
    the remapping doesn't happen and bi_partno is never cleared.
    So for stacked devices blk_partition_remap() will be called on each level.
    If (as is the case for native nvme multipathing) one of the lower-level
    devices does _not_ support partitioning, a spurious I/O error is generated.
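
    A reduced model of the behaviour change, assuming a toy structure with
    only the three fields that matter here. struct toy_bio, remap_old() and
    remap_fixed() are hypothetical names for illustration and do not mirror
    the real struct bio or the actual patch.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for a bio. */
struct toy_bio {
    uint64_t sector;   /* start sector, partition-relative until remapped */
    uint32_t nsectors; /* 0 for a flush that carries no data */
    uint8_t  partno;   /* 0 means "whole device" */
};

/* Old behaviour as described above: partno is cleared only on a real remap. */
static void remap_old(struct toy_bio *bio, uint64_t part_start)
{
    if (bio->partno && bio->nsectors) {
        bio->sector += part_start;
        bio->partno = 0;
    }
}

/* Fixed behaviour: partno is always cleared, even if no sectors were shifted. */
static void remap_fixed(struct toy_bio *bio, uint64_t part_start)
{
    if (bio->partno && bio->nsectors)
        bio->sector += part_start;
    bio->partno = 0;
}
```

    A zero-length flush leaves remap_old() with its partition number intact,
    so the next level of a stacked device would try to remap it again;
    remap_fixed() hands the lower level a bio that addresses the whole device.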

    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Hannes Reinecke
     

06 Jun, 2018

2 commits

  • If flush requests are being sent to the device we need to inherit the
    failfast and driver-specific flags, too, otherwise I/O will fail.
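
    The inheritance amounts to masking the originating request's flags and
    OR-ing the result into the flush request. The sketch below uses a made-up
    bit layout: the TOY_REQ_* values and toy_flush_flags() are assumptions for
    illustration, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical bit layout for this sketch only; the kernel's values differ. */
#define TOY_REQ_PREFLUSH      (1u << 0)
#define TOY_REQ_FAILFAST_MASK (7u << 1) /* three failfast bits */
#define TOY_REQ_DRV           (1u << 4) /* driver-specific bit */
#define TOY_REQ_SYNC          (1u << 5) /* unrelated flag, not inherited */

/* Build the flush request's flags from the request that triggered it. */
static uint32_t toy_flush_flags(uint32_t orig_flags)
{
    uint32_t inherited = orig_flags & (TOY_REQ_FAILFAST_MASK | TOY_REQ_DRV);

    return TOY_REQ_PREFLUSH | inherited;
}
```

    Only the failfast and driver-specific bits carry over; any other flag on
    the original request is dropped.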

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Hannes Reinecke
     
  • Pull xfs updates from Darrick Wong:
    "New features this cycle include the ability to relabel mounted
    filesystems, support for fallocated swapfiles, and using FUA for pure
    data O_DSYNC directio writes. With this cycle we begin to integrate
    online filesystem repair and refactor the growfs code in preparation
    for eventual subvolume support, though the road ahead for both
    features is quite long.

    There are also numerous refactorings of the iomap code to remove
    unnecessary log overhead, to disentangle some of the quota code, and
    to prepare for buffer head removal in a future upstream kernel.

    Metadata validation continues to improve, both in the hot path
    verifiers and the online filesystem check code. I anticipate sending a
    second pull request in a few days with more metadata validation
    improvements.

    This series has been run through a full xfstests run over the weekend
    and through a quick xfstests run against this morning's master, with
    no major failures reported.

    Summary:

    - Strengthen inode number and structure validation when allocating
    inodes.

    - Reduce pointless buffer allocations during cache miss

    - Use FUA for pure data O_DSYNC directio writes

    - Various iomap refactorings

    - Strengthen quota metadata verification to avoid unfixable broken
    quota

    - Make AGFL block freeing a deferred operation to avoid blowing out
    transaction reservations when running complex operations

    - Get rid of the log item descriptors to reduce log overhead

    - Fix various reflink bugs where inodes were double-joined to
    transactions

    - Don't issue discards when trimming unwritten extents

    - Refactor incore dquot initialization and retrieval interfaces

    - Fix some locking problems in the quota scrub code

    - Strengthen btree structure checks in scrub code

    - Rewrite swapfile activation to use iomap and support unwritten
    extents

    - Make scrub exit to userspace sooner when corruptions or
    cross-referencing problems are found

    - Make scrub invoke the data fork scrubber directly on metadata
    inodes

    - Don't do background reclamation of post-eof and cow blocks when the
    fs is suspended

    - Fix secondary superblock buffer lifespan hinting

    - Refactor growfs to use table-dispatched functions instead of long
    stringy functions

    - Move growfs code to libxfs

    - Implement online fs label getting and setting

    - Introduce online filesystem repair (in a very limited capacity)

    - Fix unit conversion problems in the realtime freemap iteration
    functions

    - Various refactorings and cleanups in preparation to remove buffer
    heads in a future release

    - Reimplement the old bmap call with iomap

    - Remove direct buffer head accesses from seek hole/data

    - Various bug fixes"

    * tag 'xfs-4.18-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (121 commits)
    fs: use ->is_partially_uptodate in page_cache_seek_hole_data
    fs: remove the buffer_unwritten check in page_seek_hole_data
    fs: move page_cache_seek_hole_data to iomap.c
    xfs: use iomap_bmap
    iomap: add an iomap-based bmap implementation
    iomap: add a iomap_sector helper
    iomap: use __bio_add_page in iomap_dio_zero
    iomap: move IOMAP_F_BOUNDARY to gfs2
    iomap: fix the comment describing IOMAP_NOWAIT
    iomap: inline data should be an iomap type, not a flag
    mm: split ->readpages calls to avoid non-contiguous pages lists
    mm: return an unsigned int from __do_page_cache_readahead
    mm: give the 'ret' variable a better name __do_page_cache_readahead
    block: add a lower-level bio_add_page interface
    xfs: fix error handling in xfs_refcount_insert()
    xfs: fix xfs_rtalloc_rec units
    xfs: strengthen rtalloc query range checks
    xfs: xfs_rtbuf_get should check the bmapi_read results
    xfs: xfs_rtword_t should be unsigned, not signed
    dax: change bdev_dax_supported() to support boolean returns
    ...

    Linus Torvalds
     

05 Jun, 2018

3 commits

  • I recently hit a strange filesystem corruption issue; the cause
    was overlapping partitions in the cmdline partition argument.

    This patch adds a verifier for cmdline partitions: if any
    partitions overlap, cmdline_partition logs a warning. Overlapping
    partitions are not treated as an error:
    "
    Caizhiyong said:
    Partition overlap was intentionally designed in this cmdline partition.
    reference http://lists.infradead.org/pipermail/linux-mtd/2013-August/048092.html
    "

    Signed-off-by: Wang YanQing
    Signed-off-by: Jens Axboe

    Wang YanQing
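
    The overlap check described in this commit message can be sketched
    as follows. This is only an illustration in Python, not the kernel
    code: the names Partition, find_overlaps and verify_partitions are
    hypothetical, and partitions are assumed to be half-open byte ranges
    with a positive size. As in the commit, overlaps produce warnings
    rather than an error.

```python
from typing import List, NamedTuple, Tuple


class Partition(NamedTuple):
    """Hypothetical stand-in for one parsed cmdline partition."""
    name: str
    start: int  # offset in bytes from the start of the device
    size: int   # length in bytes, assumed to be > 0


def find_overlaps(parts: List[Partition]) -> List[Tuple[str, str]]:
    """Return name pairs of partitions whose ranges [start, start + size) intersect."""
    overlaps = []
    ordered = sorted(parts, key=lambda p: p.start)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.start + a.size:
                # Sorted by start, so no later partition can overlap `a`.
                break
            overlaps.append((a.name, b.name))
    return overlaps


def verify_partitions(parts: List[Partition]) -> List[str]:
    """Build one warning per overlapping pair; overlaps are not fatal."""
    return [f"cmdline partition: {a} overlaps {b}"
            for a, b in find_overlaps(parts)]


if __name__ == "__main__":
    layout = [
        Partition("boot", 0, 1024),
        Partition("root", 512, 2048),   # starts inside "boot"
        Partition("data", 2560, 1024),  # starts exactly where "root" ends
    ]
    for warning in verify_partitions(layout):
        print(warning)
```

    Returning warnings instead of raising keeps the sketch in line with
    the quoted design note that partition overlap is permitted.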
     
  • If a hardware queue is stopped, it should not be run again before
    being explicitly started. Ignore stopped queues in blk_mq_run_work_fn(),
    fixing a regression recently introduced when the START_ON_RUN bit
    was removed.

    Fixes: 15fe8a90bb45 ("blk-mq: remove blk_mq_delay_queue()")
    Reviewed-by: Ming Lei
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jianchao Wang
    Signed-off-by: Jens Axboe

    Jianchao Wang
     
  • Pull procfs updates from Al Viro:
    "Christoph's proc_create_... cleanups series"

    * 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (44 commits)
    xfs, proc: hide unused xfs procfs helpers
    isdn/gigaset: add back gigaset_procinfo assignment
    proc: update SIZEOF_PDE_INLINE_NAME for the new pde fields
    tty: replace ->proc_fops with ->proc_show
    ide: replace ->proc_fops with ->proc_show
    ide: remove ide_driver_proc_write
    isdn: replace ->proc_fops with ->proc_show
    atm: switch to proc_create_seq_private
    atm: simplify procfs code
    bluetooth: switch to proc_create_seq_data
    netfilter/x_tables: switch to proc_create_seq_private
    netfilter/xt_hashlimit: switch to proc_create_{seq,single}_data
    neigh: switch to proc_create_seq_data
    hostap: switch to proc_create_{seq,single}_data
    bonding: switch to proc_create_seq_data
    rtc/proc: switch to proc_create_single_data
    drbd: switch to proc_create_single
    resource: switch to proc_create_seq_data
    staging/rtl8192u: simplify procfs code
    jfs: simplify procfs code
    ...

    Linus Torvalds
     

03 Jun, 2018

2 commits

  • Currently we set up q->nr_requests when switching to a new
    scheduler, but not when switching to 'none', so q->nr_requests may
    be incorrect for 'none'.

    This patch fixes this issue by always updating 'nr_requests' when
    switching to 'none'.

    Cc: Marco Patalano
    Cc: "Ewan D. Milne"
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • If we end up splitting a bio and the queue goes away between
    the initial submission and the later split submission, then we
    can block forever in blk_queue_enter() waiting for the reference
    to drop to zero. This will never happen, since we already hold
    a reference.

    Mark a split bio as already having entered the queue, so we can
    just use the live non-blocking queue enter variant.

    Thanks to Tetsuo Handa for the analysis.

    Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com
    Signed-off-by: Jens Axboe

    Jens Axboe
     

02 Jun, 2018

1 commit

  • For the upcoming removal of buffer heads in XFS we need to keep track of
    the number of outstanding writeback requests per page. For this we need
    to know if bio_add_page merged a region with the previous bvec or not.
    Instead of adding additional arguments this refactors bio_add_page to
    be implemented using three lower level helpers which users like XFS can
    use directly if they care about the merge decisions.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Jens Axboe
    Reviewed-by: Ming Lei
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Christoph Hellwig
     

01 Jun, 2018

2 commits