16 Oct, 2019

1 commit

  • rq_qos_del() incorrectly assigns the node being deleted to the head
    if it was the first on the list, in the !prev path. Fix it by
    iterating with a double pointer (**) instead.
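
    The double-pointer idiom removes the head special case entirely:
    the iterator always points at the link to rewrite, whether that is
    the head pointer itself or a previous node's next field. Below is a
    minimal standalone sketch of the pattern, using a hypothetical node
    type rather than the actual rq_qos structures:

        #include <stdio.h>

        struct node {
                int id;
                struct node *next;
        };

        /* Iterate with a pointer-to-pointer: when the victim is found,
         * *pp is the link that points at it (the head pointer or the
         * previous node's next), so rewriting *pp unlinks the victim
         * with no !prev special case. */
        static void list_del(struct node **head, struct node *victim)
        {
                struct node **pp;

                for (pp = head; *pp; pp = &(*pp)->next) {
                        if (*pp == victim) {
                                *pp = victim->next;
                                return;
                        }
                }
        }

        int main(void)
        {
                struct node c = { 3, NULL }, b = { 2, &c }, a = { 1, &b };
                struct node *head = &a;

                list_del(&head, &a);            /* delete the first node */
                for (struct node *n = head; n; n = n->next)
                        printf("%d\n", n->id);  /* prints 2, then 3 */
                return 0;
        }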

    Signed-off-by: Tejun Heo
    Cc: Josef Bacik
    Fixes: a79050434b45 ("blk-rq-qos: refactor out common elements of blk-wbt")
    Cc: stable@vger.kernel.org # v4.19+
    Signed-off-by: Jens Axboe

    Tejun Heo
     

06 Oct, 2019

1 commit

  • scale_up() wakes up waiters after scaling up. But once the maximum
    has been reached, it should not wake up more waiters, as they will
    not have anything to do. This patch fixes this by making scale_up()
    (and also scale_down()) return early when the threshold is reached.

    This bug shows up as increased fdatasync latency on 4.19 compared
    to 4.14 when fdatasync and dd conv=sync are run in parallel. It was
    introduced during the refactoring of the blk-wbt code.
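
    A minimal sketch of the idea, with hypothetical types and field
    names rather than the actual blk-wbt code: the scaler reports
    whether it actually changed the limit, and the caller wakes waiters
    only in that case, so hitting the ceiling stops generating spurious
    wakeups.

        #include <stdbool.h>
        #include <stdio.h>

        struct scaler {
                unsigned int limit;     /* current inflight limit */
                unsigned int max;       /* ceiling: scaling stops here */
        };

        /* Return true only if the limit actually changed. */
        static bool scale_up(struct scaler *s)
        {
                if (s->limit >= s->max)
                        return false;   /* at max: nothing new to hand out */
                s->limit++;
                return true;
        }

        int main(void)
        {
                struct scaler s = { .limit = 7, .max = 8 };

                if (scale_up(&s))
                        printf("scaled to %u, wake waiters\n", s.limit);
                if (!scale_up(&s))
                        printf("at max (%u), no wakeup\n", s.max);
                return 0;
        }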

    Fixes: a79050434b45 ("blk-rq-qos: refactor out common elements of blk-wbt")
    Cc: stable@vger.kernel.org
    Cc: Josef Bacik
    Signed-off-by: Harshad Shirwadkar
    Signed-off-by: Jens Axboe

    Harshad Shirwadkar
     

29 Aug, 2019

4 commits

  • This patchset implements a work-conserving proportional IO
    controller based on an IO cost model.

    While io.latency provides the capability to comprehensively prioritize
    and protect IOs depending on the cgroups, its protection is binary -
    the lowest latency target cgroup which is suffering is protected at
    the cost of all others. In many use cases including stacking multiple
    workload containers in a single system, it's necessary to distribute
    IO capacity with better granularity.

    One challenge of controlling IO resources is the lack of a trivially
    observable cost metric. The most common metrics - bandwidth and iops
    - can be off by orders of magnitude depending on the device type and
    IO pattern. However, the cost isn't a complete mystery. Given
    several key attributes, we can make fairly reliable predictions on how
    expensive a given stream of IOs would be, at least compared to other
    IO patterns.

    The function which determines the cost of a given IO is the IO cost
    model for the device. This controller distributes IO capacity based
    on the costs estimated by such a model. The more accurate the cost
    model, the better; but as long as the relative costs across
    different IO patterns are consistent and sensible, the controller
    adapts to the actual performance of the device based on IO
    completion latency.

    Currently, the only implemented cost model is a simple linear one with
    a few sets of default parameters for different classes of device.
    This covers most common devices reasonably well. All the
    infrastructure to tune and add different cost models is already in
    place and a later patch will also allow using bpf progs for cost
    models.

    Please see the top comment in blk-iocost.c and documentation for
    more details.
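
    As a rough illustration of the shape of a linear cost model, here
    is a standalone sketch with made-up parameter names and values (the
    actual controller's parameters and units differ; see blk-iocost.c):
    a fixed per-IO cost, different for random and sequential IO, plus a
    per-byte cost.

        #include <stdbool.h>
        #include <stdio.h>

        struct linear_model {
                unsigned long long seq_io_cost;  /* fixed cost, sequential */
                unsigned long long rand_io_cost; /* fixed cost, random */
                unsigned long long byte_cost;    /* cost per byte */
        };

        static unsigned long long io_cost(const struct linear_model *m,
                                          bool is_random,
                                          unsigned long long bytes)
        {
                unsigned long long fixed = is_random ? m->rand_io_cost
                                                     : m->seq_io_cost;
                return fixed + bytes * m->byte_cost;
        }

        int main(void)
        {
                /* Illustrative numbers only. */
                struct linear_model m = {
                        .seq_io_cost = 100, .rand_io_cost = 500,
                        .byte_cost = 1,
                };

                /* A small random IO is dominated by its fixed cost; a
                 * large sequential IO amortizes it. */
                printf("4k random: %llu\n", io_cost(&m, true, 4096));
                printf("1M seq:    %llu\n", io_cost(&m, false, 1ULL << 20));
                return 0;
        }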

    v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix
    for a divide-by-zero bug in current_hweight() triggered by zero
    inuse_sum.

    Signed-off-by: Tejun Heo
    Cc: Andy Newell
    Cc: Josef Bacik
    Cc: Rik van Riel
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • io.weight is going to be another rq_qos cgroup mechanism. In
    preparation, rename RQ_QOS_CGROUP, which is currently used by
    io.latency, to RQ_QOS_LATENCY.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • wbt already gets notified of queue depth changes through
    wbt_set_queue_depth(). Generalize it into
    rq_qos_ops->queue_depth_changed() so that other rq_qos policies can
    easily hook into the event too.
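
    A sketch of the generalization with simplified stand-in types (the
    kernel's real struct rq_qos and rq_qos_ops carry more state): the
    queue walks its chain of policies and invokes the new callback on
    every policy that provides one, instead of calling a wbt-specific
    function.

        #include <stdio.h>

        struct rq_qos;

        struct rq_qos_ops {
                void (*queue_depth_changed)(struct rq_qos *rqos);
        };

        struct rq_qos {
                const struct rq_qos_ops *ops;
                struct rq_qos *next;    /* next policy on the chain */
        };

        static void rq_qos_queue_depth_changed(struct rq_qos *chain)
        {
                for (struct rq_qos *rqos = chain; rqos; rqos = rqos->next)
                        if (rqos->ops->queue_depth_changed)
                                rqos->ops->queue_depth_changed(rqos);
        }

        static void wbt_depth_changed(struct rq_qos *rqos)
        {
                (void)rqos;
                printf("wbt: recomputing limits for new depth\n");
        }

        int main(void)
        {
                const struct rq_qos_ops wbt_ops = {
                        .queue_depth_changed = wbt_depth_changed,
                };
                struct rq_qos wbt = { .ops = &wbt_ops, .next = NULL };

                rq_qos_queue_depth_changed(&wbt);
                return 0;
        }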

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Add a merge hook for rq_qos. This will be used by io.weight.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     

18 Dec, 2018

1 commit

  • The blk-iolatency controller measures the time from rq_qos_throttle() to
    rq_qos_done_bio() and attributes this time to the first bio that needs
    to create the request. This means if a bio is plug-mergeable or
    bio-mergeable, it gets to bypass the blk-iolatency controller.

    The recent series [1], which tags all bios with blkgs, undermined
    how iolatency determined which bios it had charged and should
    process in rq_qos_done_bio(). Because all bios are now tagged, the
    atomic_t inflight count in struct rq_wait could underflow, resulting
    in a stall.

    This patch adds a new flag BIO_TRACKED to let controllers know that a
    bio is going through the rq_qos path. blk-iolatency now checks if this
    flag is set to see if it should process the bio in rq_qos_done_bio().

    Overloading BLK_QUEUE_ENTERED works, but makes the flag rules confusing.
    BIO_THROTTLED was another candidate, but the flag is set for all bios
    that have gone through blk-throttle code. Overloading a flag comes with
    the burden of making sure that when either implementation changes, a
    change in setting rules for one doesn't cause a bug in the other. So
    here, we unfortunately opt for adding a new flag.

    [1] https://lore.kernel.org/lkml/20181205171039.73066-1-dennis@kernel.org/
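
    A minimal sketch of the flag protocol with simplified stand-in
    types (not the kernel's struct bio): the throttle path marks the
    bio it charged, and the completion path only uncharges bios
    carrying the mark, so merged bios that bypassed throttling cannot
    underflow the inflight count.

        #include <stdio.h>

        enum {
                BIO_TRACKED = 1u << 0,  /* bio went through rq_qos */
        };

        struct bio {
                unsigned int flags;
        };

        static int inflight;

        static void rq_qos_throttle(struct bio *bio)
        {
                bio->flags |= BIO_TRACKED;
                inflight++;
        }

        static void rq_qos_done_bio(struct bio *bio)
        {
                /* Only uncharge bios we actually charged. */
                if (!(bio->flags & BIO_TRACKED))
                        return;
                inflight--;
        }

        int main(void)
        {
                struct bio charged = { 0 }, merged = { 0 };

                rq_qos_throttle(&charged);  /* first bio in the request */
                rq_qos_done_bio(&merged);   /* merged bio: ignored */
                rq_qos_done_bio(&charged);
                printf("inflight = %d\n", inflight);  /* 0, no underflow */
                return 0;
        }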

    Fixes: 5cdf2e3fea5e ("blkcg: associate blkg when associating a device")
    Signed-off-by: Dennis Zhou
    Cc: Josef Bacik
    Signed-off-by: Jens Axboe

    Dennis Zhou
     

17 Dec, 2018

1 commit

  • blk-mq-debugfs has proved very helpful for debugging tough issues
    such as IO hangs.

    We have seen blk-wbt related IO hangs several times; there is even
    an unsolved report in the Red Hat BZ. So this patch adds debugfs
    support to rq_qos.

    Cc: Bart Van Assche
    Cc: Omar Sandoval
    Cc: Christoph Hellwig
    Cc: Josef Bacik
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

08 Dec, 2018

1 commit

  • Originally, when I split the common code out of blk-wbt into
    rq_qos, I left wbt_wait() where it was and simply copied and
    modified it slightly to work for io-latency. However, they are both
    basically the same thing, and as time has gone on wbt_wait() has
    become much smarter and kinder than it was when I copied it into
    io-latency, which means io-latency has lost out on these
    improvements.

    Since they are essentially the same thing except for a few minor
    details, create rq_qos_wait(), which replicates what wbt_wait()
    currently does, with callbacks that can be passed in so each user
    can do its own thing as appropriate.
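
    A standalone sketch of the shape of such a callback-driven wait
    (the callback name and signature here are illustrative, not the
    kernel's exact API): the generic loop owns the retry logic, and
    each policy supplies a callback that tries to take an inflight
    slot against its own private data.

        #include <stdbool.h>
        #include <stdio.h>

        typedef bool (*acquire_inflight_cb_t)(void *private_data);

        /* The real code sleeps on a waitqueue between attempts; this
         * sketch just loops to show the control flow. */
        static void rq_qos_wait(void *private_data,
                                acquire_inflight_cb_t acquire)
        {
                while (!acquire(private_data))
                        ;   /* would block until a completion wakes us */
        }

        struct demo_limit {
                int inflight;
                int limit;
        };

        static bool demo_acquire(void *data)
        {
                struct demo_limit *dl = data;

                if (dl->inflight >= dl->limit)
                        return false;
                dl->inflight++;
                return true;
        }

        int main(void)
        {
                struct demo_limit dl = { .inflight = 0, .limit = 2 };

                rq_qos_wait(&dl, demo_acquire);
                printf("got a slot, inflight = %d\n", dl.inflight);
                return 0;
        }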

    Signed-off-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Josef Bacik
     

09 Jul, 2018

3 commits

  • wbt cares only about request completion time, but controllers may
    need information that is on the bio itself. So add a done_bio
    callback to rq-qos, so that things like blk-iolatency can see the
    bio when it completes.
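
    A sketch of the hook with simplified stand-in types: alongside a
    request-level completion callback, the ops table gains a bio-level
    one that receives the bio itself, so a policy can read per-bio
    state at completion.

        #include <stdio.h>

        struct bio {
                unsigned int flags;
        };

        struct rq_qos;

        struct rq_qos_ops {
                void (*done)(struct rq_qos *rqos);  /* request completed */
                void (*done_bio)(struct rq_qos *rqos, struct bio *bio);
        };

        struct rq_qos {
                const struct rq_qos_ops *ops;
        };

        static void iolat_done_bio(struct rq_qos *rqos, struct bio *bio)
        {
                (void)rqos;
                printf("iolatency: bio done, flags=%#x\n", bio->flags);
        }

        int main(void)
        {
                const struct rq_qos_ops iolat_ops = {
                        .done_bio = iolat_done_bio,
                };
                struct rq_qos iolat = { .ops = &iolat_ops };
                struct bio bio = { .flags = 0x1 };

                if (iolat.ops->done_bio)
                        iolat.ops->done_bio(&iolat, &bio);
                return 0;
        }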

    Signed-off-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Josef Bacik
     
  • We don't really need to save this stuff in the core block code; we
    can just pass the bio back into the helpers later on to derive the
    same flags and update rq->wbt_flags appropriately.
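
    A minimal sketch of deriving rather than caching (the flag names
    and the read/write test are illustrative stand-ins for the real
    wbt classification):

        #include <stdio.h>

        enum {
                WBT_TRACKED = 1u << 0,  /* write accounted by wbt */
                WBT_READ    = 1u << 1,
        };

        struct bio {
                int is_read;
        };

        /* Recompute the flags from the bio on demand instead of saving
         * them in the core block code at submission time. */
        static unsigned int bio_to_wbt_flags(const struct bio *bio)
        {
                return bio->is_read ? WBT_READ : WBT_TRACKED;
        }

        int main(void)
        {
                struct bio write_bio = { .is_read = 0 };

                printf("wbt_flags = %#x\n", bio_to_wbt_flags(&write_bio));
                return 0;
        }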

    Signed-off-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Josef Bacik
     
  • blkcg-qos is going to do essentially what wbt does, only on a cgroup
    basis. Break out the common code that will be shared between blkcg-qos
    and wbt into blk-rq-qos.* so they can both utilize the same
    infrastructure.
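
    A sketch of the resulting split with simplified stand-in types: the
    queue carries a chain of rq_qos policies, each pairing private
    state with a shared ops table, and the core calls one helper per
    event that dispatches to every registered policy.

        #include <stdio.h>

        struct rq_qos;

        struct rq_qos_ops {
                void (*throttle)(struct rq_qos *rqos);
        };

        struct rq_qos {
                const struct rq_qos_ops *ops;
                struct rq_qos *next;    /* next policy on the chain */
        };

        struct request_queue {
                struct rq_qos *rq_qos;  /* head of the policy chain */
        };

        static void rq_qos_throttle(struct request_queue *q)
        {
                for (struct rq_qos *rqos = q->rq_qos; rqos;
                     rqos = rqos->next)
                        if (rqos->ops->throttle)
                                rqos->ops->throttle(rqos);
        }

        static void wbt_throttle(struct rq_qos *rqos)
        {
                (void)rqos;
                printf("wbt: throttle\n");
        }

        int main(void)
        {
                const struct rq_qos_ops wbt_ops = {
                        .throttle = wbt_throttle,
                };
                struct rq_qos wbt = { .ops = &wbt_ops, .next = NULL };
                struct request_queue q = { .rq_qos = &wbt };

                rq_qos_throttle(&q);    /* wbt sees the event */
                return 0;
        }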

    Signed-off-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Josef Bacik