07 Sep, 2021

1 commit

  • The pending timer has been set up in blk_throtl_init(). However, the
    timer is not deleted in blk_throtl_exit(). This means that the timer
    handler may still be running after freeing the timer, which would
    result in a use-after-free.

    Fix by calling del_timer_sync() to delete the timer in blk_throtl_exit().

    Signed-off-by: Li Jinlin
    Link: https://lore.kernel.org/r/20210907121242.2885564-1-lijinlin3@huawei.com
    Signed-off-by: Jens Axboe

    Li Jinlin
     

15 Aug, 2021

1 commit

  • After patch 54efd50 (block: make generic_make_request handle
    arbitrarily sized bios), the IOs passing through io-throttle may be
    larger, and these IOs may later be split into several smaller IOs.
    However, the IOPS throttle does not seem to be aware of this change,
    so large IOs are undercounted and the disk-side IOPS does not meet
    expectations. Maybe we should fix this problem.

    We can reproduce it by setting max_sectors_kb of the disk to 128,
    setting blkio.write_iops_throttle to 100, running a dd instance inside
    blkio and using iostat to watch IOPS:

    dd if=/dev/zero of=/dev/sdb bs=1M count=1000 oflag=direct

    As a result, without this change the average IOPS is 1995; with
    this change the IOPS is 98.

    Signed-off-by: Chunguang Xu
    Acked-by: Tejun Heo
    Link: https://lore.kernel.org/r/65869aaad05475797d63b4c3fed4f529febe3c26.1627876014.git.brookxu@tencent.com
    Signed-off-by: Jens Axboe

    Chunguang Xu
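
To make the arithmetic behind the reproduction above concrete, here is a minimal userspace sketch (not the kernel code) of how many device-side IOs one bio becomes once it is split at max_sectors_kb. The helper names split_io_count() and disk_iops_if_charged_per_bio() are hypothetical, and the sketch assumes that max_sectors_kb is the only split limit, which is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of device-side IOs produced when one bio of
 * bio_bytes is split at a max_sectors_kb boundary (rounded up). */
static uint64_t split_io_count(uint64_t bio_bytes, uint64_t max_sectors_kb)
{
	uint64_t max_bytes = max_sectors_kb * 1024;

	assert(max_bytes > 0);
	if (bio_bytes == 0)
		return 0;
	return (bio_bytes + max_bytes - 1) / max_bytes;
}

/* Device-side IOPS under this model if the throttle charges exactly one IO
 * per submitted bio, regardless of how often the bio is split later. */
static uint64_t disk_iops_if_charged_per_bio(uint64_t iops_limit,
					     uint64_t bio_bytes,
					     uint64_t max_sectors_kb)
{
	return iops_limit * split_io_count(bio_bytes, max_sectors_kb);
}
```

With bs=1M and max_sectors_kb=128, split_io_count() gives 8, so a limit of 100 that counts bios rather than split IOs would admit 800 IOs per second at the disk under this model. The sketch does not attempt to reproduce the measured 1995; it only shows why counting per bio undercounts.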
     

25 Jan, 2021

1 commit

  • Replace the gendisk pointer in struct bio with a pointer to the newly
    improved struct block_device. From that the gendisk can be trivially
    accessed with an extra indirection, but it also allows direct lookup
    of all information related to partition remapping.

    Signed-off-by: Christoph Hellwig
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

03 Dec, 2020

1 commit

  • …THROTTLING_LOW is off

    blk_throtl_update_limit_valid() searches the descendants to see whether
    'LIMIT_LOW' of bps/iops and READ/WRITE is nonzero. However, these values
    are always zero if CONFIG_BLK_DEV_THROTTLING_LOW is not set, so iterating
    over the descendants only wastes time.

    Thus do nothing in blk_throtl_update_limit_valid() in that situation.

    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

    Yu Kuai
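
The idea of skipping the descendant walk when the feature is compiled out can be sketched in userspace as follows. This is an illustration only: struct tg_model, any_low_limit(), nodes_visited and the MODEL_THROTTLING_LOW macro are hypothetical stand-ins, not kernel names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a throttle group with child groups. */
struct tg_model {
	unsigned long low_limit;        /* models a LIMIT_LOW value */
	struct tg_model **children;
	size_t nr_children;
};

static unsigned long nodes_visited;    /* how many nodes the walk touched */

#ifdef MODEL_THROTTLING_LOW
/* Feature on: walk the tree looking for any nonzero low limit. */
static int any_low_limit(const struct tg_model *tg)
{
	assert(tg != NULL);
	nodes_visited++;
	if (tg->low_limit)
		return 1;
	for (size_t i = 0; i < tg->nr_children; i++)
		if (any_low_limit(tg->children[i]))
			return 1;
	return 0;
}
#else
/* Feature off: every low limit is zero by construction, so skip the walk. */
static int any_low_limit(const struct tg_model *tg)
{
	assert(tg != NULL);
	return 0;
}
#endif
```

Built without MODEL_THROTTLING_LOW, the function returns immediately and nodes_visited stays at zero, which is the saving the commit message describes.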
     

08 Oct, 2020

8 commits


15 Sep, 2020

5 commits


01 Jul, 2020

1 commit


29 Jun, 2020

3 commits


30 May, 2020

2 commits


08 Nov, 2019

2 commits


16 Sep, 2019

1 commit

  • Currently rq->data_len will be decreased by partial completion or
    zeroed by completion, so when blk_stat_add() is invoked, data_len
    will be zero and there will never be samples in poll_cb because
    blk_mq_poll_stats_bkt() will return -1 if data_len is zero.

    We could move blk_stat_add() back to __blk_mq_complete_request(),
    but that would undo the effort of calling ktime_get_ns() only once.
    Instead we can reuse the throtl_size field for both block stats and
    block throttle, and adjust the logic in blk_mq_poll_stats_bkt()
    accordingly.

    Fixes: 4bc6339a583c ("block: move blk_stat_add() to __blk_mq_end_request()")
    Tested-by: Pavel Begunkov
    Signed-off-by: Hou Tao
    Signed-off-by: Jens Axboe

    Hou Tao
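
The bucket problem described above can be illustrated with a small userspace sketch. It is loosely modeled on the behaviour the commit message attributes to blk_mq_poll_stats_bkt() and is not the kernel function. The bucket count MODEL_POLL_STATS_BKTS, the helper model_ilog2() and the choice of sectors as the size unit are assumptions made for this example.

```c
#include <assert.h>

#define MODEL_POLL_STATS_BKTS 16   /* assumed bucket count for this sketch */

/* floor(log2(v)) for v > 0. Defined to return -1 for v == 0 so that a
 * zero size produces a negative bucket below. */
static int model_ilog2(unsigned int v)
{
	int log = -1;

	while (v) {
		v >>= 1;
		log++;
	}
	return log;
}

/* ddir: 0 = read, 1 = write. size_sectors: size recorded at issue time,
 * so it is not affected by later completion. */
static int model_poll_stats_bkt(int ddir, unsigned int size_sectors)
{
	int bucket;

	assert(ddir == 0 || ddir == 1);
	bucket = ddir + 2 * model_ilog2(size_sectors);
	if (bucket < 0)
		return -1;                                /* zero size: no sample */
	if (bucket >= MODEL_POLL_STATS_BKTS)
		return ddir + MODEL_POLL_STATS_BKTS - 2;  /* clamp to the last pair */
	return bucket;
}
```

Feeding this function a size that has already been zeroed by completion always yields -1, so no sample is recorded. Passing a size captured at issue time avoids that.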
     

29 Aug, 2019

1 commit


10 Jul, 2019

1 commit

  • After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when
    iops limit is enforced"), the wait time could be zero even if the group
    is throttled and cannot issue requests right now. As a result,
    throtl_select_dispatch() turns into a busy-loop under the irq-safe queue
    spinlock.

    The fix is simple: always round up the target time to the next throttle
    slice.

    Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
    Signed-off-by: Konstantin Khlebnikov
    Cc: stable@vger.kernel.org # v4.19+
    Signed-off-by: Jens Axboe

    Konstantin Khlebnikov
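
The rounding described above can be sketched in userspace as below. This illustrates "round up to the next slice" and is not the kernel patch: the helper name wait_until_next_slice() and the abstract tick unit are assumptions.

```c
#include <assert.h>

/* Hypothetical helper: ticks to wait until the first slice boundary that
 * lies strictly after 'elapsed'. The result is always in [1, slice], so a
 * throttled group never gets a zero wait time. */
static unsigned long wait_until_next_slice(unsigned long elapsed,
					   unsigned long slice)
{
	unsigned long next_boundary;

	assert(slice > 0);
	next_boundary = (elapsed / slice + 1) * slice;
	return next_boundary - elapsed;
}
```

For example, with a slice of 100 ticks, an elapsed time of 150 gives a wait of 50, and an elapsed time that sits exactly on a boundary (100) gives a full slice of 100 rather than 0. A nonzero wait is what prevents the dispatch loop from spinning.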
     

01 Jun, 2019

1 commit

  • Commit e99e88a9d2b0 renamed a function argument without updating the
    corresponding kernel-doc header. Update the kernel-doc header.

    Reviewed-by: Chaitanya Kulkarni
    Reviewed-by: Kees Cook
    Fixes: e99e88a9d2b0 ("treewide: setup_timer() -> timer_setup()") # v4.15.
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

08 Dec, 2018

4 commits

  • bio_issue_init among other things initializes the timestamp for an IO.
    Rather than have this logic handled by policies, this consolidates it to
    be on the init paths (normal, clone, bounce clone).

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Liu Bo
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • Previously, blkg association was handled by controller specific code in
    blk-throttle and blk-iolatency. However, because a blkg represents a
    relationship between a blkcg and a request_queue, it makes sense to keep
    the blkg->q and bio->bi_disk->queue consistent.

    This patch moves association into the bio_set_dev() macro. This should
    cover the majority of cases where the device is set/changed keeping the
    two pointers consistent. Fallback code is added to
    blkcg_bio_issue_check() to catch any missing paths.

    Signed-off-by: Dennis Zhou
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • There are 3 ways blkg association can happen: association with the
    current css, with the page css (swap), or from the wbc css (writeback).

    This patch handles how association is done for the first case where we
    are associating based on the current css. If there is already a blkg
    associated, the css will be reused and association will be redone as the
    request_queue may have changed.

    Signed-off-by: Dennis Zhou
    Reviewed-by: Josef Bacik
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • There are several scenarios where blkg_lookup_create() can fail, such as
    the blkcg dying, the request_queue dying, or the system simply being
    OOM. Most callers handle this by falling back to the q->root_blkg and
    calling it a day.

    This patch implements the notion of closest blkg. During
    blkg_lookup_create(), if it fails to create, return the closest blkg
    found or the q->root_blkg. blkg_try_get_closest() is introduced and used
    during association so a bio is always attached to a blkg.

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
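
The "closest blkg" fallback described in the last commit above can be sketched in userspace as a walk towards the root of a hierarchy. This is a simplified model, not the kernel implementation: struct cg_model, its has_blkg flag and closest_with_blkg() are hypothetical, and reference counting is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cgroup node. has_blkg says whether a blkg
 * already exists for this node on the queue in question. */
struct cg_model {
	struct cg_model *parent;   /* NULL for the root */
	int has_blkg;
};

/* Return the nearest node that has a blkg, starting at cg and walking
 * towards the root. Falls back to root when no node on the path has one,
 * so the caller always gets something to attach to. */
static struct cg_model *closest_with_blkg(struct cg_model *cg,
					  struct cg_model *root)
{
	assert(root != NULL);
	for (; cg != NULL; cg = cg->parent)
		if (cg->has_blkg)
			return cg;
	return root;
}
```

When creation fails for a leaf group, the lookup under this model settles on the nearest ancestor that already has a blkg, and only uses the root when nothing closer exists.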
     

16 Nov, 2018

4 commits


02 Nov, 2018

1 commit

  • This reverts a series committed earlier due to the null pointer exception
    bug report in [1]. It seems there are edge case interactions that I did
    not consider, and I will need some time to understand what causes the
    adverse interactions.

    The original series can be found in [2] with a follow up series in [3].

    [1] https://www.spinics.net/lists/cgroups/msg20719.html
    [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/
    [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/

    This reverts the following commits:
    d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae,
    f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef,
    a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53

    Signed-off-by: Dennis Zhou
    Signed-off-by: Jens Axboe

    Dennis Zhou
     

22 Sep, 2018

2 commits

  • bio_issue_init among other things initializes the timestamp for an IO.
    Rather than have this logic handled by policies, this consolidates it to
    be on the init paths (normal, clone, bounce clone).

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Liu Bo
    Signed-off-by: Jens Axboe

    Dennis Zhou (Facebook)
     
  • Previously, blkg's were only assigned as needed by blk-iolatency and
    blk-throttle. bio->css was also always being associated while blkg was
    being looked up and then thrown away in blkcg_bio_issue_check.

    This patch begins the cleanup of bio->css and bio->bi_blkg by always
    associating a blkg in blkcg_bio_issue_check. This tries to create the
    blkg, but if it is not possible, falls back to using the root_blkg of
    the request_queue. Therefore, a bio will always be associated with a
    blkg. The duplicate association logic is removed from blk-throttle and
    blk-iolatency.

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Dennis Zhou (Facebook)