Eric Lee / smarc-fsl-linux-kernel

07 Sep, 2018

1 commit

d5274b3cd block: bfq: swap puts in bfqg_and_blkg_put

Fix a trivial use-after-free: this could be the last reference to bfqg.

Fixes: 8f9bebc33dd7 ("block, bfq: access and cache blkg data only when safe")
Acked-by: Paolo Valente
Signed-off-by: Konstantin Khlebnikov
Signed-off-by: Jens Axboe

Konstantin Khlebnikov
2018-09-07 01:32:58 +0800
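
The ordering problem behind this fix can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: `outer_obj`, `inner_obj` and the helper names are hypothetical stand-ins for the blkg and the bfq group. It assumes only what the message states, namely that dropping the inner reference may free the object, so anything reached through it has to be handled first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the blkg and the bfq group; not kernel types. */
struct outer_obj {
	int ref;
};

struct inner_obj {
	int ref;
	struct outer_obj *owner;	/* reached only through the inner object */
};

static void outer_put(struct outer_obj *o)
{
	assert(o->ref > 0);
	o->ref--;
}

/* Drop one reference; free the object and return true on the last one. */
static bool inner_put(struct inner_obj *i)
{
	assert(i->ref > 0);
	if (--i->ref == 0) {
		free(i);
		return true;
	}
	return false;
}

/*
 * Safe ordering: dereference i->owner while our own reference still
 * keeps i alive, and only then drop that reference. The reverse order
 * would read i->owner after i may already have been freed.
 */
static bool inner_and_outer_put(struct inner_obj *i)
{
	outer_put(i->owner);
	return inner_put(i);
}
```

Calling `inner_and_outer_put()` on an object whose count is 1 releases the outer reference first and then frees the inner object, which is the order the commit title describes.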

17 Aug, 2018

1 commit

fc8ebd01d block, bfq: return nbytes and not zero from struct cftype .write() method

The value that struct cftype .write() method returns is then directly
returned to userspace as the value returned by write() syscall, so it
should be the number of bytes actually written (or consumed) and not zero.

Returning zero from write() syscall makes programs like /bin/echo or bash
spin.

Signed-off-by: Maciej S. Szmigiero
Fixes: e21b7a0b9887 ("block, bfq: add full hierarchical scheduling and cgroups support")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe

Maciej S. Szmigiero
2018-08-17 03:11:16 +0800
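
Why a zero return makes callers spin can be seen with a small userspace model. This is a sketch under assumptions: `write_fn`, the two handlers and `write_all()` are hypothetical names, and the retry loop only approximates what a program that writes until all bytes are consumed might do. The `max_calls` cap exists so the sketch terminates; a real caller would have no such cap.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical handler type modelled on a write-style callback. */
typedef ssize_t (*write_fn)(const char *buf, size_t nbytes);

/* Buggy pattern: accepts the data but reports zero bytes consumed. */
static ssize_t handler_returns_zero(const char *buf, size_t nbytes)
{
	(void)buf;
	(void)nbytes;
	return 0;
}

/* Fixed pattern: reports the number of bytes consumed. */
static ssize_t handler_returns_nbytes(const char *buf, size_t nbytes)
{
	(void)buf;
	return (ssize_t)nbytes;
}

/*
 * Retry until every byte is consumed. Returns the number of calls
 * made, or -1 on error or when max_calls is exhausted.
 */
static int write_all(write_fn fn, const char *buf, size_t len, int max_calls)
{
	size_t done = 0;
	int calls = 0;

	assert(max_calls > 0);
	while (done < len) {
		ssize_t n;

		if (calls == max_calls)
			return -1;	/* no progress: an uncapped loop would spin */
		n = fn(buf + done, len - done);
		calls++;
		if (n < 0)
			return -1;
		done += (size_t)n;
	}
	return calls;
}
```

With the handler that returns `nbytes` the loop finishes after one call; with the handler that returns zero it never makes progress and only stops because of the cap.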

09 May, 2018

1 commit

84c7afceb block: use ktime_get_ns() instead of sched_clock() for cfq and bfq

cfq and bfq have some internal fields that use sched_clock() which can
trivially use ktime_get_ns() instead. Their timestamp fields in struct
request can also use ktime_get_ns(), which resolves the 8 year old
comment added by commit 28f4197e5d47 ("block: disable preemption before
using sched_clock()").

Signed-off-by: Omar Sandoval
Signed-off-by: Jens Axboe

Omar Sandoval
2018-05-09 22:33:06 +0800
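
As a rough userspace analogue of taking nanosecond timestamps from a monotonic clock, the sketch below uses POSIX `clock_gettime(CLOCK_MONOTONIC)`. It is not kernel code and does not call `ktime_get_ns()`; `monotonic_ns()` and `elapsed_ns()` are hypothetical helper names chosen for this illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock and return it as a nanosecond count. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Interval between two timestamps taken with monotonic_ns(). */
static uint64_t elapsed_ns(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start;
}
```

A caller would record `monotonic_ns()` when a request starts and pass both timestamps to `elapsed_ns()` when it completes.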

09 Jan, 2018

1 commit

52257ffbf block, bfq: put async queues for root bfq groups too

For each pair [device for which bfq is selected as I/O scheduler,
group in blkio/io], bfq maintains a corresponding bfq group. Each such
bfq group contains a set of async queues, with each async queue
created on demand, i.e., when some I/O request arrives for it. On
creation, an async queue gets an extra reference, to make sure that
the queue is not freed as long as its bfq group exists. Accordingly,
to allow the queue to be freed after the group has exited, this extra
reference must be released on group exit.

The above also holds for a bfq root group, i.e., for the bfq group
corresponding to the root blkio/io group for a given device. Yet, by
mistake, the references to the existing async queues of a root group
are not released when the latter exits. This causes a memory leak when
the instance of bfq for a given device exits. In a similar vein,
bfqg_stats_xfer_dead is not executed for a root group.

This commit fixes bfq_pd_offline so that the latter executes the above
missing operations for a root group too.

Reported-by: Holger Hoffstätte
Reported-by: Guoqing Jiang
Tested-by: Holger Hoffstätte
Signed-off-by: Davide Ferrari
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2018-01-09 23:45:25 +0800
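
The reference scheme described in this message can be modelled in a few lines of userspace C. This is a sketch only: `struct group`, `struct async_queue` and the helpers are hypothetical, and the array size is arbitrary. It shows a queue created on demand with one extra reference held by its group, and an offline path that releases those references for every group, root included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_ASYNC 3	/* arbitrary size for this sketch */

struct async_queue {
	int ref;
};

struct group {
	bool is_root;
	struct async_queue *async[NR_ASYNC];
};

/* Create the queue on demand; the group holds one extra reference to it. */
static struct async_queue *group_get_async(struct group *g, int idx)
{
	assert(idx >= 0 && idx < NR_ASYNC);
	if (!g->async[idx]) {
		g->async[idx] = calloc(1, sizeof(*g->async[idx]));
		if (g->async[idx])
			g->async[idx]->ref = 1;
	}
	return g->async[idx];
}

/* Release the group's extra references; returns how many queues were freed. */
static int group_put_async_queues(struct group *g)
{
	int freed = 0;

	for (int i = 0; i < NR_ASYNC; i++) {
		struct async_queue *q = g->async[i];

		if (!q)
			continue;
		g->async[i] = NULL;
		if (--q->ref == 0) {
			free(q);
			freed++;
		}
	}
	return freed;
}

/* Offline path shared by all groups, so a root group leaks nothing. */
static int group_offline(struct group *g)
{
	/* A leaky version would return early here when g->is_root is set. */
	return group_put_async_queues(g);
}
```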

15 Nov, 2017

1 commit

a33801e8b block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP

BFQ currently creates, and updates, its own instance of the whole
set of blkio statistics that cfq creates. Yet, from the comments
of Tejun Heo in [1], it turned out that most of these statistics
are meant/useful only for debugging. This commit makes BFQ create
these debugging statistics only if the option
CONFIG_DEBUG_BLK_CGROUP is set.

By doing so, this commit also gives BFQ a significant performance
boost. The reason is that, if CONFIG_DEBUG_BLK_CGROUP is not set, then
BFQ has to update far fewer statistics, and, in particular, not the
heaviest to update. To give an idea of the benefits, if
CONFIG_DEBUG_BLK_CGROUP is not set, then, on an Intel i7-4850HQ, and
with 8 threads doing random I/O in parallel on null_blk (configured
with 0 latency), the throughput of BFQ grows from 310 to 400 KIOPS
(+30%). We have measured similar or even much higher boosts with other
CPUs: e.g., +45% with an ARM Cortex-A53 octa-core. Our results have
been obtained and can be reproduced very easily with the script in [1].

[1] https://www.spinics.net/lists/linux-block/msg18943.html

Suggested-by: Tejun Heo
Suggested-by: Ulf Hansson
Tested-by: Lee Tibbert
Tested-by: Oleksandr Natalenko
Signed-off-by: Luca Miccio
Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Luca Miccio
2017-11-15 11:13:33 +0800
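
The general technique of compiling statistics in or out with a configuration option can be sketched as follows. The macro `SKETCH_DEBUG_STATS` and the field and function names are hypothetical; they stand in for CONFIG_DEBUG_BLK_CGROUP and the real statistics, whose exact layout is not shown in the message above.

```c
#include <assert.h>

/* Hypothetical switch standing in for CONFIG_DEBUG_BLK_CGROUP. */
/* #define SKETCH_DEBUG_STATS 1 */

struct group_stats {
	unsigned long basic_count;	/* always maintained */
#ifdef SKETCH_DEBUG_STATS
	unsigned long debug_count;	/* debugging only */
#endif
};

#ifdef SKETCH_DEBUG_STATS
static void stats_update_debug(struct group_stats *s)
{
	s->debug_count++;
}
#else
/* Empty stub: with the option off, the hot path does no extra work. */
static inline void stats_update_debug(struct group_stats *s)
{
	(void)s;
}
#endif

/* Hot path: callers need no #ifdef of their own. */
static void account_io(struct group_stats *s)
{
	assert(s);
	s->basic_count++;
	stats_update_debug(s);
}
```

Keeping the `#ifdef` inside the helper rather than at each call site is one common way to avoid scattering conditionals through the hot path.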

02 Sep, 2017

1 commit

dfb79af54 bfq: Declare local functions static

Acked-by: Paolo Valente
Signed-off-by: Bart Van Assche
Signed-off-by: Jens Axboe

Bart Van Assche
2017-09-02 03:56:37 +0800

08 Jun, 2017

1 commit

8f9bebc33 block, bfq: access and cache blkg data only when safe

In blk-cgroup, operations on blkg objects are protected with the
request_queue lock. This is no longer the lock that protects
I/O-scheduler operations in blk-mq. In fact, the latter are now
protected with a finer-grained per-scheduler-instance lock. As a
consequence, although blkg lookups are also rcu-protected, blk-mq I/O
schedulers may see inconsistent data when they access blkg and
blkg-related objects. BFQ does access these objects, and does incur
this problem, in the following case.

The blkg_lookup performed in bfq_get_queue, being protected (only)
through rcu, may happen to return the address of a copy of the
original blkg. If this is the case, then the blkg_get performed in
bfq_get_queue, to pin down the blkg, is useless: it does not prevent
blk-cgroup code from destroying both the original blkg and all objects
directly or indirectly referred to by the copy of the blkg. BFQ accesses
these objects, which typically causes a crash due to a NULL-pointer
dereference or a memory-protection violation.

Some additional protection mechanism should be added to blk-cgroup to
address this issue. In the meantime, this commit provides a quick
temporary fix for BFQ: cache (when safe) blkg data that might
disappear right after a blkg_lookup.

In particular, this commit exploits the following facts to achieve its
goal without introducing further locks. Destroy operations on a blkg
invoke, as a first step, hooks of the scheduler associated with the
blkg. And these hooks are executed with bfqd->lock held for BFQ. As a
consequence, for any blkg associated with the request queue an
instance of BFQ is attached to, we are guaranteed that such a blkg is
not destroyed, and that all the pointers it contains are consistent,
while that instance is holding its bfqd->lock. A blkg_lookup performed
with bfqd->lock held then returns a fully consistent blkg, which
remains consistent as long as this lock is held. In more detail, this holds
even if the returned blkg is a copy of the original one.

Finally, the object describing a group inside BFQ also needs to be
protected from destruction on the blkg_free of the original blkg
(which invokes bfq_pd_free). This commit adds private refcounting for
this object, to let it disappear only after no bfq_queue refers to it
any longer.

This commit also removes or updates some stale comments on locking
issues related to blk-cgroup operations.

Reported-by: Tomas Konir
Reported-by: Lee Tibbert
Reported-by: Marco Piazza
Signed-off-by: Paolo Valente
Tested-by: Tomas Konir
Tested-by: Lee Tibbert
Tested-by: Marco Piazza
Signed-off-by: Jens Axboe

Paolo Valente
2017-06-08 23:51:10 +0800
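
The private refcounting mentioned in the last part of this message can be pictured with a small userspace model. All names here (`grp`, `que`, `grp_free_hook()` and so on) are hypothetical and only approximate the idea: the group object starts with one reference owned by whoever created it, each queue that points at the group takes another, and the object is freed only when the last of them is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct grp {
	int ref;
};

struct que {
	struct grp *grp;
};

/* Allocate a group holding the creator's initial reference. */
static struct grp *grp_alloc(void)
{
	struct grp *g = calloc(1, sizeof(*g));

	if (g)
		g->ref = 1;
	return g;
}

/* Drop one reference; free the group and return true on the last one. */
static bool grp_put(struct grp *g)
{
	assert(g->ref > 0);
	if (--g->ref == 0) {
		free(g);
		return true;
	}
	return false;
}

/* A queue pins the group for as long as it points at it. */
static void que_attach(struct que *q, struct grp *g)
{
	g->ref++;
	q->grp = g;
}

static bool que_detach(struct que *q)
{
	struct grp *g = q->grp;

	q->grp = NULL;
	return grp_put(g);
}

/* Free hook: drops only the creator's reference, not the queues'. */
static bool grp_free_hook(struct grp *g)
{
	return grp_put(g);
}
```

In this model the free hook can run while a queue is still attached without the group disappearing under it; the memory is released when the queue detaches.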

19 Apr, 2017

1 commit

ea25da480 block, bfq: split bfq-iosched.c into multiple source files

The BFQ I/O scheduler features an optimal fair-queuing
(proportional-share) scheduling algorithm, enriched with several
mechanisms to boost throughput and reduce latency for interactive and
real-time applications. This makes BFQ a large and complex piece of
code. This commit addresses this issue by splitting BFQ into three
main, independent components, and by moving each component into a
separate source file:
1. Main algorithm: handles the interaction with the kernel, and
decides which requests to dispatch; it uses the following two further
components to achieve its goals.
2. Scheduling engine (Hierarchical B-WF2Q+ scheduling algorithm):
computes the schedule, using weights and budgets provided by the above
component.
3. cgroups support: handles group operations (creation, destruction,
move, ...).

Signed-off-by: Paolo Valente
Signed-off-by: Jens Axboe

Paolo Valente
2017-04-19 22:48:24 +0800