20 Sep, 2018

1 commit

  • commit d5274b3cd6a814ccb2f56d81ee87cbbf51bd4cf7 upstream.

    Fix trivial use-after-free. This could be the last reference to bfqg.

    Fixes: 8f9bebc33dd7 ("block, bfq: access and cache blkg data only when safe")
    Acked-by: Paolo Valente
    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
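
The hazard named in this commit message can be illustrated with a minimal userspace sketch. It is a generic refcounting pattern, not the actual patch: `struct group`, `group_new`, `group_put` and `group_put_and_get_id` are hypothetical names introduced only for illustration. The point is that once a put may drop the last reference, nothing may be read through the pointer afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; NOT the kernel's struct bfq_group. */
struct group {
    int ref;   /* number of holders */
    int id;    /* a field a caller may still want after its put */
};

static struct group *group_new(int id)
{
    struct group *g = malloc(sizeof(*g));

    if (!g)
        return NULL;
    g->ref = 1;
    g->id = id;
    return g;
}

/* Drop one reference; frees the object and returns 1 if it was the last. */
static int group_put(struct group *g)
{
    assert(g->ref > 0);
    if (--g->ref > 0)
        return 0;
    free(g);
    return 1;
}

/*
 * Safe ordering: copy out whatever is needed BEFORE the put, because the
 * put may free the object. Reading g->id after group_put(g) would be a
 * use-after-free whenever this was the last reference.
 */
static int group_put_and_get_id(struct group *g)
{
    int id = g->id;   /* read first */

    group_put(g);     /* g must not be touched after this line */
    return id;
}
```

The same reasoning applies when two objects are released together: anything reached through the first object has to be fetched, or released, before the put that may free that first object.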
     

10 Sep, 2018

1 commit

  • commit fc8ebd01deeb12728c83381f6ec923e4a192ffd3 upstream.

    The value that the struct cftype .write() method returns is then directly
    returned to userspace as the value returned by the write() syscall, so it
    should be the number of bytes actually written (or consumed) and not zero.

    Returning zero from the write() syscall makes programs like /bin/echo or
    bash spin.

    Signed-off-by: Maciej S. Szmigiero
    Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Maciej S. Szmigiero
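
A small userspace simulation shows why a zero return value makes callers spin. Everything here is an assumption made for illustration: `write_fn`, the two handlers and `write_all` are hypothetical, and the loop is bounded by `max_calls` only so that the demonstration terminates. A retry loop that does not treat zero specially would keep going forever.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical handler type, loosely modelled on a write-style callback. */
typedef ssize_t (*write_fn)(const char *buf, size_t nbytes);

/* Buggy behaviour: the input is accepted, but 0 bytes are reported. */
static ssize_t handler_returns_zero(const char *buf, size_t nbytes)
{
    (void)buf;
    (void)nbytes;
    return 0;
}

/* Fixed behaviour: report that all bytes were consumed. */
static ssize_t handler_returns_nbytes(const char *buf, size_t nbytes)
{
    (void)buf;
    return (ssize_t)nbytes;
}

/*
 * "Write everything" retry loop. Returns the number of calls needed, or -1
 * on error or when max_calls is reached without finishing (the bounded
 * equivalent of spinning).
 */
static int write_all(write_fn w, const char *buf, size_t len, int max_calls)
{
    int calls = 0;

    while (len > 0) {
        ssize_t n;

        if (calls == max_calls)
            return -1;
        n = w(buf, len);
        calls++;
        if (n < 0)
            return -1;
        buf += n;              /* no progress at all when n == 0 */
        len -= (size_t)n;
    }
    return calls;
}
```

With the fixed handler the loop finishes after a single call; with the zero-returning handler it makes no progress and stops only because of the artificial bound.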
     

12 Apr, 2018

1 commit

  • [ Upstream commit 52257ffbfcaf58d247b13fb148e27ed17c33e526 ]

    For each pair [device for which bfq is selected as I/O scheduler,
    group in blkio/io], bfq maintains a corresponding bfq group. Each such
    bfq group contains a set of async queues, with each async queue
    created on demand, i.e., when some I/O request arrives for it. On
    creation, an async queue gets an extra reference, to make sure that
    the queue is not freed as long as its bfq group exists. Accordingly,
    to allow the queue to be freed after the group has exited, this extra
    reference must be released on group exit.

    The above holds also for a bfq root group, i.e., for the bfq group
    corresponding to the root blkio/io group for a given device. Yet, by
    mistake, the references to the existing async queues of a root group
    are not released when the latter exits. This causes a memory leak when
    the instance of bfq for a given device exits. In a similar vein,
    bfqg_stats_xfer_dead is not executed for a root group.

    This commit fixes bfq_pd_offline so that the latter executes the above
    missing operations for a root group too.

    Reported-by: Holger Hoffstätte
    Reported-by: Guoqing Jiang
    Tested-by: Holger Hoffstätte
    Signed-off-by: Davide Ferrari
    Signed-off-by: Paolo Valente
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Paolo Valente
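
The reference-counting rule described in this commit message can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the bfq code: `struct queue`, `struct group`, `NUM_ASYNC`, `group_get_async`, `group_offline` and the `skip_root` flag (which reproduces the bug on request) are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_ASYNC 4   /* hypothetical number of async slots per group */

struct queue { int ref; };

struct group {
    int is_root;
    struct queue *async[NUM_ASYNC];   /* created on demand */
};

static int live_queues;   /* allocated queues, to make a leak visible */

static void queue_put(struct queue *q)
{
    assert(q->ref > 0);
    if (--q->ref == 0) {
        free(q);
        live_queues--;
    }
}

/* Return the async queue for a slot, creating it on first use. */
static struct queue *group_get_async(struct group *g, int slot)
{
    if (!g->async[slot]) {
        struct queue *q = calloc(1, sizeof(*q));

        if (!q)
            return NULL;
        q->ref = 1;            /* extra reference owned by the group */
        g->async[slot] = q;
        live_queues++;
    }
    g->async[slot]->ref++;     /* reference handed to the caller */
    return g->async[slot];
}

/*
 * Group exit: drop the group's extra reference on every async queue.
 * skip_root != 0 models the bug, where a root group returns early.
 */
static void group_offline(struct group *g, int skip_root)
{
    int i;

    if (g->is_root && skip_root)
        return;
    for (i = 0; i < NUM_ASYNC; i++) {
        if (g->async[i]) {
            queue_put(g->async[i]);
            g->async[i] = NULL;
        }
    }
}
```

In this model, a root group that returns early from `group_offline` keeps its extra references forever, so its async queues are never freed even after every user has dropped its own reference.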
     

02 Sep, 2017

1 commit


08 Jun, 2017

1 commit

  • In blk-cgroup, operations on blkg objects are protected with the
    request_queue lock. This is no longer the lock that protects
    I/O-scheduler operations in blk-mq. In fact, the latter are now
    protected with a finer-grained per-scheduler-instance lock. As a
    consequence, although blkg lookups are also rcu-protected, blk-mq I/O
    schedulers may see inconsistent data when they access blkg and
    blkg-related objects. BFQ does access these objects, and does incur
    this problem, in the following case.

    The blkg_lookup performed in bfq_get_queue, being protected (only)
    through rcu, may happen to return the address of a copy of the
    original blkg. If this is the case, then the blkg_get performed in
    bfq_get_queue, to pin down the blkg, is useless: it does not prevent
    blk-cgroup code from destroying both the original blkg and all objects
    directly or indirectly referred to by the copy of the blkg. BFQ accesses
    these objects, which typically causes a crash for NULL-pointer
    dereference or memory-protection violation.

    Some additional protection mechanism should be added to blk-cgroup to
    address this issue. In the meantime, this commit provides a quick
    temporary fix for BFQ: cache (when safe) blkg data that might
    disappear right after a blkg_lookup.

    In particular, this commit exploits the following facts to achieve its
    goal without introducing further locks. Destroy operations on a blkg
    invoke, as a first step, hooks of the scheduler associated with the
    blkg. And these hooks are executed with bfqd->lock held for BFQ. As a
    consequence, for any blkg associated with the request queue an
    instance of BFQ is attached to, we are guaranteed that such a blkg is
    not destroyed, and that all the pointers it contains are consistent,
    while that instance is holding its bfqd->lock. A blkg_lookup performed
    with bfqd->lock held then returns a fully consistent blkg, which
    remains consistent as long as this lock is held. In more detail, this holds
    even if the returned blkg is a copy of the original one.

    Finally, also the object describing a group inside BFQ needs to be
    protected from destruction on the blkg_free of the original blkg
    (which invokes bfq_pd_free). This commit adds private refcounting for
    this object, to let it disappear only after no bfq_queue refers to it
    any longer.

    This commit also removes or updates some stale comments on locking
    issues related to blk-cgroup operations.

    Reported-by: Tomas Konir
    Reported-by: Lee Tibbert
    Reported-by: Marco Piazza
    Signed-off-by: Paolo Valente
    Tested-by: Tomas Konir
    Tested-by: Lee Tibbert
    Tested-by: Marco Piazza
    Signed-off-by: Jens Axboe

    Paolo Valente
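
The private refcounting mentioned in the last part of this commit message can be sketched as below. It is an illustrative userspace model, not the bfq implementation: `struct group`, `struct queue`, `group_owner_free`, `queue_attach` and `queue_detach` are hypothetical names, and locking is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

struct group { int ref; };            /* hypothetical group descriptor */
struct queue { struct group *grp; };  /* a queue pins its group */

static int groups_freed;   /* lets a caller observe when a free happens */

static void group_get(struct group *g)
{
    g->ref++;
}

static void group_put(struct group *g)
{
    assert(g->ref > 0);
    if (--g->ref == 0) {
        free(g);
        groups_freed++;
    }
}

/* The creator (standing in for the owning blkg) holds one reference. */
static struct group *group_alloc(void)
{
    struct group *g = calloc(1, sizeof(*g));

    if (g)
        g->ref = 1;
    return g;
}

/* Called when the owner goes away: drops only the owner's reference. */
static void group_owner_free(struct group *g)
{
    group_put(g);
}

static void queue_attach(struct queue *q, struct group *g)
{
    group_get(g);
    q->grp = g;
}

static void queue_detach(struct queue *q)
{
    struct group *g = q->grp;

    q->grp = NULL;
    group_put(g);   /* may free g, so q->grp was cleared first */
}
```

In this model the owner's free no longer destroys the group outright; the memory is released only when the last queue detaches.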
     

19 Apr, 2017

1 commit

  • The BFQ I/O scheduler features an optimal fair-queuing
    (proportional-share) scheduling algorithm, enriched with several
    mechanisms to boost throughput and reduce latency for interactive and
    real-time applications. This makes BFQ a large and complex piece of
    code. This commit addresses this issue by splitting BFQ into three
    main, independent components, and by moving each component into a
    separate source file:
    1. Main algorithm: handles the interaction with the kernel, and
    decides which requests to dispatch; it uses the following two further
    components to achieve its goals.
    2. Scheduling engine (Hierarchical B-WF2Q+ scheduling algorithm):
    computes the schedule, using weights and budgets provided by the above
    component.
    3. cgroups support: handles group operations (creation, destruction,
    move, ...).

    Signed-off-by: Paolo Valente
    Signed-off-by: Jens Axboe

    Paolo Valente