16 Sep, 2019

1 commit

  • Currently rq->data_len will be decreased by partial completion or
    zeroed by completion, so when blk_stat_add() is invoked, data_len
    will be zero and there will never be samples in poll_cb because
    blk_mq_poll_stats_bkt() will return -1 if data_len is zero.

    We could move blk_stat_add() back to __blk_mq_complete_request(),
    but that would defeat the point of calling ktime_get_ns() only
    once. Instead we can reuse the throtl_size field for both block
    stats and block throttling, and adjust the logic in
    blk_mq_poll_stats_bkt() accordingly.

    Fixes: 4bc6339a583c ("block: move blk_stat_add() to __blk_mq_end_request()")
    Tested-by: Pavel Begunkov
    Signed-off-by: Hou Tao
    Signed-off-by: Jens Axboe

    Hou Tao
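The bucket behaviour described above can be modelled in a small userspace sketch. The bucket count, the direction/size encoding, and the helper names are assumptions for illustration, not the kernel's exact blk_mq_poll_stats_bkt():

```c
#include <assert.h>

#define POLL_STATS_BKTS 16 /* assumed: 2 directions x 8 size buckets */

/* log2 of v, or -1 for v == 0 */
static int ilog2_u32(unsigned int v)
{
    int l = -1;

    while (v) {
        v >>= 1;
        l++;
    }
    return l;
}

/* Model of the bucket selection: a request whose recorded size is
 * already zero (completed) gets bucket -1, so blk_stat_add() never
 * records a sample for it. */
static int poll_stats_bkt(int ddir, unsigned int sectors)
{
    int bucket;

    if (sectors == 0)
        return -1; /* data_len zeroed by completion: drop the sample */

    bucket = ddir + 2 * ilog2_u32(sectors);
    if (bucket >= POLL_STATS_BKTS)
        bucket = POLL_STATS_BKTS - 2 + ddir; /* clamp oversized IOs */
    return bucket;
}
```

Keeping the request size in a field that completion does not zero (the reused throtl_size) is what makes the `sectors == 0` branch stop firing for every sample.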
     

10 Jul, 2019

1 commit

  • After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when
    iops limit is enforced") the wait time could be zero even if the group
    is throttled and cannot issue requests right now. As a result,
    throtl_select_dispatch() turns into a busy-loop under the irq-safe
    queue spinlock.

    Fix is simple: always round up target time to the next throttle slice.

    Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
    Signed-off-by: Konstantin Khlebnikov
    Cc: stable@vger.kernel.org # v4.19+
    Signed-off-by: Jens Axboe

    Konstantin Khlebnikov
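The rounding described above can be sketched as follows. Times are in jiffies, the helper mirrors roundup() semantics, and the function name is illustrative rather than the kernel's exact code:

```c
#include <assert.h>

/* Sketch of the fix: round the elapsed time up to the end of the next
 * throttle slice, so the computed wait for a throttled group is never
 * zero. */
static unsigned long next_slice_end(unsigned long jiffy_elapsed,
                                    unsigned long throtl_slice)
{
    /* roundup(jiffy_elapsed + 1, throtl_slice): strictly in the future,
     * so throtl_select_dispatch() always sees a nonzero wait and cannot
     * busy-loop under the irq-safe queue spinlock. */
    return ((jiffy_elapsed + throtl_slice) / throtl_slice) * throtl_slice;
}
```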
     

01 Jun, 2019

1 commit

  • Commit e99e88a9d2b0 renamed a function argument without updating the
    corresponding kernel-doc header. Update the kernel-doc header.

    Reviewed-by: Chaitanya Kulkarni
    Reviewed-by: Kees Cook
    Fixes: e99e88a9d2b0 ("treewide: setup_timer() -> timer_setup()") # v4.15.
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

08 Dec, 2018

4 commits

  • bio_issue_init among other things initializes the timestamp for an IO.
    Rather than have this logic handled by policies, this consolidates it to
    be on the init paths (normal, clone, bounce clone).

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Liu Bo
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
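A simplified model of an issue stamp set once on the init paths. The kernel packs the timestamp and request size into a single u64; the 12-bit size field and helper names here are illustrative, not the kernel's exact layout:

```c
#include <assert.h>
#include <stdint.h>

#define ISSUE_SIZE_BITS 12
#define ISSUE_SIZE_MASK ((1ULL << ISSUE_SIZE_BITS) - 1)

struct bio_issue {
    uint64_t value; /* packed timestamp + size */
};

/* Called once on every bio init path (normal, clone, bounce clone)
 * instead of in each policy. The real code also masks the timestamp
 * down to its own bit width; omitted here for brevity. */
static void bio_issue_init(struct bio_issue *issue, uint64_t now_ns,
                           uint64_t size)
{
    if (size > ISSUE_SIZE_MASK)
        size = ISSUE_SIZE_MASK; /* saturate oversized requests */
    issue->value = (now_ns << ISSUE_SIZE_BITS) | size;
}

static uint64_t bio_issue_time(const struct bio_issue *issue)
{
    return issue->value >> ISSUE_SIZE_BITS;
}

static uint64_t bio_issue_size(const struct bio_issue *issue)
{
    return issue->value & ISSUE_SIZE_MASK;
}
```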
     
  • Previously, blkg association was handled by controller specific code in
    blk-throttle and blk-iolatency. However, because a blkg represents a
    relationship between a blkcg and a request_queue, it makes sense to keep
    the blkg->q and bio->bi_disk->queue consistent.

    This patch moves association into the bio_set_dev() macro. This should
    cover the majority of cases where the device is set/changed keeping the
    two pointers consistent. Fallback code is added to
    blkcg_bio_issue_check() to catch any missing paths.

    Signed-off-by: Dennis Zhou
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • There are 3 ways blkg association can happen: association with the
    current css, with the page css (swap), or from the wbc css (writeback).

    This patch handles how association is done for the first case where we
    are associating based on the current css. If there is already a blkg
    associated, the css will be reused and association will be redone as the
    request_queue may have changed.

    Signed-off-by: Dennis Zhou
    Reviewed-by: Josef Bacik
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Dennis Zhou
     
  • There are several scenarios where blkg_lookup_create() can fail, such as
    the blkcg dying, the request_queue dying, or an allocation failing. Most
    handle this by simply falling back to the q->root_blkg and calling it a
    day.

    This patch implements the notion of closest blkg. During
    blkg_lookup_create(), if it fails to create, return the closest blkg
    found or the q->root_blkg. blkg_try_get_closest() is introduced and used
    during association so a bio is always attached to a blkg.

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
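The closest-blkg walk can be sketched with toy structures. The refcount handling below is a simplified stand-in for the kernel's percpu refs, and the structures are illustrative:

```c
#include <assert.h>
#include <stddef.h>

struct blkg {
    struct blkg *parent;
    int refcnt; /* 0 means dying: tryget must fail */
};

static int blkg_tryget(struct blkg *blkg)
{
    if (blkg->refcnt == 0)
        return 0;
    blkg->refcnt++;
    return 1;
}

/* Walk toward the root until a reference can be taken, so a bio is
 * always attached to some blkg: the closest live ancestor, or the
 * root as a last resort. */
static struct blkg *blkg_try_get_closest(struct blkg *blkg,
                                         struct blkg *root_blkg)
{
    while (blkg && !blkg_tryget(blkg))
        blkg = blkg->parent; /* climb past dying groups */
    return blkg ? blkg : root_blkg;
}
```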
     

02 Nov, 2018

1 commit

  • This reverts a series committed earlier due to null pointer exception
    bug report in [1]. It seems there are edge case interactions that I did
    not consider and will need some time to understand what causes the
    adverse interactions.

    The original series can be found in [2] with a follow up series in [3].

    [1] https://www.spinics.net/lists/cgroups/msg20719.html
    [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/
    [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/

    This reverts the following commits:
    d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae,
    f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef,
    a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53

    Signed-off-by: Dennis Zhou
    Signed-off-by: Jens Axboe

    Dennis Zhou
     

22 Sep, 2018

2 commits

  • bio_issue_init among other things initializes the timestamp for an IO.
    Rather than have this logic handled by policies, this consolidates it to
    be on the init paths (normal, clone, bounce clone).

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Reviewed-by: Liu Bo
    Signed-off-by: Jens Axboe

    Dennis Zhou (Facebook)
     
  • Previously, blkgs were only assigned as needed by blk-iolatency and
    blk-throttle. bio->css was also always associated, while the blkg was
    looked up and then thrown away in blkcg_bio_issue_check().

    This patch begins the cleanup of bio->css and bio->bi_blkg by always
    associating a blkg in blkcg_bio_issue_check. This tries to create the
    blkg, but if it is not possible, falls back to using the root_blkg of
    the request_queue. Therefore, a bio will always be associated with a
    blkg. The duplicate association logic is removed from blk-throttle and
    blk-iolatency.

    Signed-off-by: Dennis Zhou
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Dennis Zhou (Facebook)
     

01 Sep, 2018

1 commit

  • There is a very small chance a bio gets caught up in a really
    unfortunate race between a task migration, cgroup exiting, and itself
    trying to associate with a blkg. This is due to css offlining being
    performed after the css->refcnt is killed which triggers removal of
    blkgs that reach their blkg->refcnt of 0.

    To avoid this, association with a blkg should use tryget and fallback to
    using the root_blkg.

    Fixes: 08e18eab0c579 ("block: add bi_blkg to the bio for cgroups")
    Reviewed-by: Josef Bacik
    Signed-off-by: Dennis Zhou
    Cc: Jiufei Xue
    Cc: Joseph Qi
    Cc: Tejun Heo
    Cc: Josef Bacik
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    Dennis Zhou (Facebook)
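The tryget-with-fallback pattern amounts to an increment-if-not-zero on the reference count; the sketch below is simplified from the kernel's percpu_ref machinery, and the helper name is illustrative:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still nonzero. A bio racing
 * with css offline then falls back to the root_blkg instead of
 * resurrecting a dying group. */
static bool refcount_tryget(atomic_int *ref)
{
    int v = atomic_load(ref);

    while (v > 0) {
        /* increment only if nobody dropped it to zero meanwhile;
         * on failure, v is reloaded and the check repeats */
        if (atomic_compare_exchange_weak(ref, &v, v + 1))
            return true;
    }
    return false;
}
```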
     

10 Aug, 2018

1 commit

  • When an application's iops exceeds its cgroup's iops limit, it is
    throttled and the kernel sets a timer for dispatching, so the IO
    latency includes this delay.

    However, the dispatch delay, which is calculated from the limit and
    the elapsed jiffies, is suboptimal. Since the delay is only computed
    once the application's iops reaches (iops limit + 1), the bio doesn't
    need to wait any longer than the remaining time of the current slice.

    The difference can be proved by the following fio job and cgroup iops
    setting,
    -----
    $ echo 4 > /mnt/config/nullb/disk1/mbps # limit nullb's bandwidth to 4MB/s for testing.
    $ echo "253:1 riops=100 rbps=max" > /sys/fs/cgroup/unified/cg1/io.max
    $ cat r2.job
    [global]
    name=fio-rand-read
    filename=/dev/nullb1
    rw=randread
    bs=4k
    direct=1
    numjobs=1
    time_based=1
    runtime=60
    group_reporting=1

    [file1]
    size=4G
    ioengine=libaio
    iodepth=1
    rate_iops=50000
    norandommap=1
    thinktime=4ms
    -----

    w/o patch:
    file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
    fio-3.7-66-gedfc
    Starting 1 process

    read: IOPS=99, BW=400KiB/s (410kB/s)(23.4MiB/60001msec)
    slat (usec): min=10, max=336, avg=27.71, stdev=17.82
    clat (usec): min=2, max=28887, avg=5929.81, stdev=7374.29
    lat (usec): min=24, max=28901, avg=5958.73, stdev=7366.22
    clat percentiles (usec):
    | 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 4],
    | 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 6], 60.00th=[11731],
    | 70.00th=[11863], 80.00th=[11994], 90.00th=[12911], 95.00th=[22676],
    | 99.00th=[23725], 99.50th=[23987], 99.90th=[23987], 99.95th=[25035],
    | 99.99th=[28967]

    w/ patch:
    file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
    fio-3.7-66-gedfc
    Starting 1 process

    read: IOPS=100, BW=400KiB/s (410kB/s)(23.4MiB/60005msec)
    slat (usec): min=10, max=155, avg=23.24, stdev=16.79
    clat (usec): min=2, max=12393, avg=5961.58, stdev=5959.25
    lat (usec): min=23, max=12412, avg=5985.91, stdev=5951.92
    clat percentiles (usec):
    | 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 4], 20.00th=[ 4],
    | 30.00th=[ 4], 40.00th=[ 5], 50.00th=[ 47], 60.00th=[11863],
    | 70.00th=[11994], 80.00th=[11994], 90.00th=[11994], 95.00th=[11994],
    | 99.00th=[11994], 99.50th=[11994], 99.90th=[12125], 99.95th=[12125],
    | 99.99th=[12387]

    Signed-off-by: Liu Bo
    Signed-off-by: Jens Axboe

    Liu Bo
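The shorter wait can be sketched as below; the function name is illustrative, times are in jiffies, and this is not the kernel's exact code:

```c
#include <assert.h>

/* Sketch of the improved dispatch delay: once the group is one IO over
 * its limit, wait only until the current throttle slice ends, instead
 * of using a delay derived from the iops limit. */
static unsigned long iops_dispatch_wait(unsigned long jiffy_elapsed,
                                        unsigned long throtl_slice)
{
    /* end of the slice that jiffy_elapsed falls in */
    unsigned long slice_end =
        ((jiffy_elapsed / throtl_slice) + 1) * throtl_slice;

    return slice_end - jiffy_elapsed;
}
```

Bounding the wait by the slice is what pulls the tail latencies in the fio run above down from ~23ms to ~12ms.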
     

09 Jul, 2018

1 commit

  • Currently io.low uses the bi_cg_private field to stash its private data
    for the blkg; however, other blkcg policies may want to use this as well.
    Since
    we can get the private data out of the blkg, move this to bi_blkg in the
    bio and make it generic, then we can use bio_associate_blkg() to attach
    the blkg to the bio.

    Theoretically we could simply replace the bi_css with this since we can
    get to all the same information from the blkg, however you have to
    lookup the blkg, so for example wbc_init_bio() would have to lookup and
    possibly allocate the blkg for the css it was trying to attach to the
    bio. This could be problematic and result in us either not attaching
    the css at all to the bio, or falling back to the root blkcg if we are
    unable to allocate the corresponding blkg.

    So for now do this, and in the future if possible we could just replace
    the bi_css with bi_blkg and update the helpers to do the correct
    translation.

    Signed-off-by: Josef Bacik
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Josef Bacik
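Why a single generic bi_blkg pointer suffices can be illustrated with toy structures: each policy's private data hangs off the blkg, so any policy is reachable from the one pointer stored in the bio. The structure names and policy index below are assumptions, not kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

#define NR_POLICIES 4

struct blkg {
    void *pd[NR_POLICIES]; /* per-policy private data */
};

struct bio_sketch {
    struct blkg *bi_blkg; /* replaces the policy-specific bi_cg_private */
};

/* Any policy recovers its private data from the shared blkg pointer. */
static void *bio_policy_data(struct bio_sketch *bio, int pol)
{
    return bio->bi_blkg ? bio->bi_blkg->pd[pol] : NULL;
}
```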