24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
    markings that are no longer necessary (a minimal sketch of the conversion
    follows this entry).

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
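
    A minimal userspace sketch of the conversion described above, assuming a
    GCC/Clang version that understands the fallthrough statement attribute;
    the macro definition here is a simplified stand-in for the kernel's,
    shown only to make the before/after shape concrete:

    #include <stdio.h>

    /* Simplified stand-in for the kernel's pseudo-keyword: on compilers
     * that support the fallthrough statement attribute it silences the
     * implicit-fallthrough warning, exactly like the old comment did. */
    #if defined(__has_attribute)
    #if __has_attribute(__fallthrough__)
    #define fallthrough __attribute__((__fallthrough__))
    #endif
    #endif
    #ifndef fallthrough
    #define fallthrough do {} while (0)  /* fallback: no-op */
    #endif

    static const char *classify(int op)
    {
            switch (op) {
            case 0:
                    /* was a "fall through" comment here */
                    fallthrough;
            case 1:
                    return "accounted";
            default:
                    return "ignored";
            }
    }

    int main(void)
    {
            puts(classify(0));
            return 0;
    }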
     

30 May, 2020

2 commits


17 Apr, 2020

1 commit


06 Oct, 2019

1 commit

  • scale_up wakes up waiters after scaling up. But once the maximum has been
    reached, it should not wake up more waiters, as they will not have
    anything to do. This patch fixes this by making scale_up (and also
    scale_down) return when the threshold is reached (see the sketch after
    this entry).

    This bug causes increased fdatasync latency when fdatasync and dd
    conv=sync are performed in parallel on 4.19 compared to 4.14. This
    bug was introduced during refactoring of blk-wbt code.

    Fixes: a79050434b45 ("blk-rq-qos: refactor out common elements of blk-wbt")
    Cc: stable@vger.kernel.org
    Cc: Josef Bacik
    Signed-off-by: Harshad Shirwadkar
    Signed-off-by: Jens Axboe

    Harshad Shirwadkar
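
    A simplified userspace model of the behaviour described above
    (illustrative only; the struct and field names are stand-ins, not the
    blk-wbt code): the scaling helper reports whether it actually added
    room, and the caller only wakes waiters when it did.

    #include <stdbool.h>
    #include <stdio.h>

    struct depth_model {
            unsigned int depth;      /* current number of allowed requests */
            unsigned int max_depth;  /* upper bound; scaling stops here */
    };

    /* Returns true only if the depth actually grew; once max_depth is
     * reached there is nothing new for waiters to do, so report false. */
    static bool scale_up(struct depth_model *d)
    {
            if (d->depth >= d->max_depth)
                    return false;
            d->depth++;
            return true;
    }

    int main(void)
    {
            struct depth_model d = { .depth = 63, .max_depth = 64 };

            /* Only wake waiters when scale_up reports that room was added. */
            if (scale_up(&d))
                    puts("depth grew: wake waiters");
            if (!scale_up(&d))
                    puts("already at max: no wakeup");
            return 0;
    }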
     

29 Aug, 2019

1 commit


28 Aug, 2019

1 commit


01 May, 2019

1 commit


25 Jan, 2019

1 commit

  • This patch prevents sparse and gcc from reporting the following warnings
    (a generic illustration of silencing them follows this entry):

    CHECK block/blk-wbt.c
    block/blk-wbt.c:600:6: warning: symbol 'wbt_issue' was not declared. Should it be static?
    block/blk-wbt.c:620:6: warning: symbol 'wbt_requeue' was not declared. Should it be static?
    CC block/blk-wbt.o
    block/blk-wbt.c:600:6: warning: no previous prototype for wbt_issue [-Wmissing-prototypes]
    void wbt_issue(struct rq_qos *rqos, struct request *rq)
    ^~~~~~~~~
    block/blk-wbt.c:620:6: warning: no previous prototype for wbt_requeue [-Wmissing-prototypes]
    void wbt_requeue(struct rq_qos *rqos, struct request *rq)
    ^~~~~~~~~~~

    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Bart Van Assche
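
    The warnings above are the usual sign of a function that is only used
    inside one translation unit but still has external linkage; a generic,
    hypothetical illustration of how such warnings are silenced (not the
    blk-wbt code):

    /* example.c -- build with: gcc -Wmissing-prototypes -c example.c */

    /* Giving a file-local helper static linkage satisfies both the sparse
     * "should it be static?" check and gcc's -Wmissing-prototypes, since
     * no header needs to declare it any more. */
    static int local_helper(int value)
    {
            return value * 2;
    }

    int exported_entry(int value);   /* prototype for external callers */

    int exported_entry(int value)
    {
            return local_helper(value);
    }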
     

17 Dec, 2018

1 commit


12 Dec, 2018

1 commit

  • rwb_enabled() can't be changed while there is any inflight IO.

    wbt_disable_default() may set rwb->wb_normal to zero, but the blk_stat
    timer may still be pending, and the timer function will then update
    rwb->wb_normal again.

    This patch introduces blk_stat_deactivate() and applies it in
    wbt_disable_default(), which fixes the following IO hang triggered when
    running parted and switching the IO scheduler (a single-threaded sketch
    of the ordering problem follows this entry):

    [ 369.937806] INFO: task parted:3645 blocked for more than 120 seconds.
    [ 369.938941] Not tainted 4.20.0-rc6-00284-g906c801e5248 #498
    [ 369.939797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 369.940768] parted D 0 3645 3239 0x00000000
    [ 369.941500] Call Trace:
    [ 369.941874] ? __schedule+0x6d9/0x74c
    [ 369.942392] ? wbt_done+0x5e/0x5e
    [ 369.942864] ? wbt_cleanup_cb+0x16/0x16
    [ 369.943404] ? wbt_done+0x5e/0x5e
    [ 369.943874] schedule+0x67/0x78
    [ 369.944298] io_schedule+0x12/0x33
    [ 369.944771] rq_qos_wait+0xb5/0x119
    [ 369.945193] ? karma_partition+0x1c2/0x1c2
    [ 369.945691] ? wbt_cleanup_cb+0x16/0x16
    [ 369.946151] wbt_wait+0x85/0xb6
    [ 369.946540] __rq_qos_throttle+0x23/0x2f
    [ 369.947014] blk_mq_make_request+0xe6/0x40a
    [ 369.947518] generic_make_request+0x192/0x2fe
    [ 369.948042] ? submit_bio+0x103/0x11f
    [ 369.948486] ? __radix_tree_lookup+0x35/0xb5
    [ 369.949011] submit_bio+0x103/0x11f
    [ 369.949436] ? blkg_lookup_slowpath+0x25/0x44
    [ 369.949962] submit_bio_wait+0x53/0x7f
    [ 369.950469] blkdev_issue_flush+0x8a/0xae
    [ 369.951032] blkdev_fsync+0x2f/0x3a
    [ 369.951502] do_fsync+0x2e/0x47
    [ 369.951887] __x64_sys_fsync+0x10/0x13
    [ 369.952374] do_syscall_64+0x89/0x149
    [ 369.952819] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 369.953492] RIP: 0033:0x7f95a1e729d4
    [ 369.953996] Code: Bad RIP value.
    [ 369.954456] RSP: 002b:00007ffdb570dd48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
    [ 369.955506] RAX: ffffffffffffffda RBX: 000055c2139c6be0 RCX: 00007f95a1e729d4
    [ 369.956389] RDX: 0000000000000001 RSI: 0000000000001261 RDI: 0000000000000004
    [ 369.957325] RBP: 0000000000000002 R08: 0000000000000000 R09: 000055c2139c6ce0
    [ 369.958199] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c2139c0380
    [ 369.959143] R13: 0000000000000004 R14: 0000000000000100 R15: 0000000000000008

    Cc: stable@vger.kernel.org
    Cc: Paolo Valente
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
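
    A single-threaded sketch of the ordering described above (illustrative
    names only, not the kernel API): if the stat callback can still run
    after wbt has been disabled, it re-derives wb_normal and undoes the
    disable, so the stats machinery has to be deactivated first.

    #include <stdbool.h>
    #include <stdio.h>

    struct wbt_model {
            bool stats_active;       /* stands in for the pending blk_stat timer */
            unsigned int wb_normal;  /* 0 means "throttling disabled" */
    };

    /* Stand-in for the blk_stat timer callback: recomputes the limits. */
    static void stat_timer_fn(struct wbt_model *w)
    {
            if (w->stats_active)
                    w->wb_normal = 8;   /* would overwrite a concurrent disable */
    }

    /* Buggy order: clear wb_normal while the callback may still fire. */
    static void disable_buggy(struct wbt_model *w)
    {
            w->wb_normal = 0;
            stat_timer_fn(w);           /* pending timer runs afterwards */
    }

    /* Fixed order: deactivate the stats machinery first, then clear. */
    static void disable_fixed(struct wbt_model *w)
    {
            w->stats_active = false;    /* analogue of blk_stat_deactivate() */
            stat_timer_fn(w);           /* a late callback is now a no-op */
            w->wb_normal = 0;
    }

    int main(void)
    {
            struct wbt_model a = { true, 8 }, b = { true, 8 };

            disable_buggy(&a);
            disable_fixed(&b);
            printf("buggy: wb_normal=%u, fixed: wb_normal=%u\n",
                   a.wb_normal, b.wb_normal);
            return 0;
    }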
     

08 Dec, 2018

1 commit


16 Nov, 2018

4 commits


08 Nov, 2018

1 commit


12 Oct, 2018

1 commit

  • Tetsuo brought to my attention that I screwed up the scale_up/scale_down
    helpers when I factored out the rq-qos code. We need to wake up all the
    waiters when we add slots for requests, not when we shrink the slots.
    Otherwise we'll end up with things waiting forever. This was a mistake;
    this patch simply puts everything back the way it was.

    Cc: stable@vger.kernel.org
    Fixes: a79050434b45 ("blk-rq-qos: refactor out common elements of blk-wbt")
    Reported-by: Tetsuo Handa
    Signed-off-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Josef Bacik
     

28 Aug, 2018

3 commits

  • We already note and mark discard and swap IO from bio_to_wbt_flags().

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • We have two potential issues:

    1) After commit 2887e41b910b, we only wake one process at a time when
    we finish an IO. We really want to wake up as many tasks as can
    queue IO. Before this commit, we woke up everyone, which could cause
    a thundering herd issue.

    2) A task can potentially consume two wakeups, causing us to (in
    practice) miss a wakeup.

    Fix both by providing our own wakeup function, which stops
    __wake_up_common() from waking up more tasks if we fail to get a
    queueing token. With the strict ordering we have on the wait list, this
    wakes the right tasks and the right number of tasks (a sketch of this
    wakeup policy follows this entry).

    Based on a patch from Jianchao Wang.

    Tested-by: Agarwal, Anchal
    Signed-off-by: Jens Axboe

    Jens Axboe
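
    A userspace sketch of that wakeup policy (no real wait queues here;
    tokens stands in for the queueing budget and the names are made up):
    waiters are woken in FIFO order only while tokens can still be taken,
    so no wakeup is wasted on a task that could not queue IO anyway.

    #include <stdbool.h>
    #include <stdio.h>

    #define NR_WAITERS 5

    /* Try to take one queueing token; loosely analogous to grabbing a
     * slot under the inflight limit. */
    static bool get_token(int *tokens)
    {
            if (*tokens <= 0)
                    return false;
            (*tokens)--;
            return true;
    }

    /* Wake waiters in FIFO order, but stop at the first one that cannot
     * get a token -- waking it, or anyone behind it, would be pointless. */
    static int wake_waiters(int *tokens, int nr_waiting)
    {
            int woken = 0;

            while (woken < nr_waiting && get_token(tokens))
                    woken++;
            return woken;
    }

    int main(void)
    {
            int tokens = 2;
            int woken = wake_waiters(&tokens, NR_WAITERS);

            printf("woke %d of %d waiters, %d tokens left\n",
                   woken, NR_WAITERS, tokens);
            return 0;
    }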
     
  • Prep patch for calling the handler from a different context,
    no functional changes in this patch.

    Tested-by: Agarwal, Anchal
    Signed-off-by: Jens Axboe

    Jens Axboe
     

23 Aug, 2018

4 commits

  • A previous commit removed the ability to have per-rq flags. We used
    those flags to maintain inflight counts. Since we don't have those
    anymore, we have to always maintain inflight counts, even if wbt is
    disabled. This is clearly suboptimal.

    Add a queue quiesce around changing the wbt latency settings from sysfs
    to work around this. With that, we can reliably put the enabled check in
    our bio_to_wbt_flags(), since we know the WBT_TRACKED flag will be
    consistent for the lifetime of the request.

    Fixes: c1c80384c8f ("block: remove external dependency on wbt_flags")
    Reviewed-by: Josef Bacik
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • We need to do this inside the loop as well, or we can allow new
    IO to supersede previous IO.

    Tested-by: Anchal Agarwal
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • We need the memory barrier before checking the list head,
    use the appropriate helper for this. The matching queue
    side memory barrier is provided by set_current_state().

    Tested-by: Anchal Agarwal
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Check it in one place, instead of in multiple places.

    Tested-by: Anchal Agarwal
    Signed-off-by: Jens Axboe

    Jens Axboe
     

15 Aug, 2018

1 commit

  • One wbt invariant is that if an IO is tracked via WBT_TRACKED,
    rqw->inflight must be updated to account for it.

    But commit c1c80384c8f ("block: remove external dependency on wbt_flags")
    forgot to remove the early handling of !rwb_enabled(rwb) inside wbt_wait().
    As a result, the inflight counter may not be increased in wbt_wait() but is
    still decreased in wbt_done() for this kind of IO, so the counter can
    become negative and wbt_wait() may then wait forever (a compact model of
    this imbalance follows this entry).

    This patch fixes the report in the following link:

    https://marc.info/?l=linux-block&m=153221542021033&w=2

    Fixes: c1c80384c8f ("block: remove external dependency on wbt_flags")
    Cc: Josef Bacik
    Reported-by: Ming Lei
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
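
    A compact userspace model of the imbalance described above (illustrative
    names only): the counter is only correct if every completion-side
    decrement is paired with a submission-side increment, so an early return
    that skips the increment for IO that is still marked as tracked drives
    the counter negative.

    #include <stdbool.h>
    #include <stdio.h>

    static int inflight;    /* stands in for rqw->inflight */

    /* Submission side: the buggy early return skips the increment even
     * though the IO will still be treated as tracked on completion. */
    static void wait_model(bool wbt_enabled, bool buggy_early_return)
    {
            if (!wbt_enabled && buggy_early_return)
                    return;
            inflight++;
    }

    /* Completion side: always decrements for tracked IO. */
    static void done_model(void)
    {
            inflight--;
    }

    int main(void)
    {
            wait_model(false, true);    /* wbt disabled + early return */
            done_model();
            printf("inflight after one IO: %d (negative => waiters hang)\n",
                   inflight);
            return 0;
    }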
     

08 Aug, 2018

1 commit

  • I am currently running a large bare metal instance (i3.metal)
    on EC2 with 72 cores, 512GB of RAM and NVME drives, with a
    4.18 kernel. I have a workload that simulates a database
    workload, and I am running into lockup issues when writeback
    throttling is enabled, with the hung task detector also
    kicking in.

    Crash dumps show that most CPUs (up to 50 of them) are
    all trying to get the wbt wait queue lock while trying to add
    themselves to it in __wbt_wait (see stack traces below).

    [ 0.948118] CPU: 45 PID: 0 Comm: swapper/45 Not tainted 4.14.51-62.38.amzn1.x86_64 #1
    [ 0.948119] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017
    [ 0.948120] task: ffff883f7878c000 task.stack: ffffc9000c69c000
    [ 0.948124] RIP: 0010:native_queued_spin_lock_slowpath+0xf8/0x1a0
    [ 0.948125] RSP: 0018:ffff883f7fcc3dc8 EFLAGS: 00000046
    [ 0.948126] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7fce2a00
    [ 0.948128] RDX: 000000000000001c RSI: 0000000000740001 RDI: ffff887f7709ca68
    [ 0.948129] RBP: 0000000000000002 R08: 0000000000b80000 R09: 0000000000000000
    [ 0.948130] R10: ffff883f7fcc3d78 R11: 000000000de27121 R12: 0000000000000002
    [ 0.948131] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
    [ 0.948132] FS: 0000000000000000(0000) GS:ffff883f7fcc0000(0000) knlGS:0000000000000000
    [ 0.948134] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 0.948135] CR2: 000000c424c77000 CR3: 0000000002010005 CR4: 00000000003606e0
    [ 0.948136] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 0.948137] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 0.948138] Call Trace:
    [ 0.948139]
    [ 0.948142] do_raw_spin_lock+0xad/0xc0
    [ 0.948145] _raw_spin_lock_irqsave+0x44/0x4b
    [ 0.948149] ? __wake_up_common_lock+0x53/0x90
    [ 0.948150] __wake_up_common_lock+0x53/0x90
    [ 0.948155] wbt_done+0x7b/0xa0
    [ 0.948158] blk_mq_free_request+0xb7/0x110
    [ 0.948161] __blk_mq_complete_request+0xcb/0x140
    [ 0.948166] nvme_process_cq+0xce/0x1a0 [nvme]
    [ 0.948169] nvme_irq+0x23/0x50 [nvme]
    [ 0.948173] __handle_irq_event_percpu+0x46/0x300
    [ 0.948176] handle_irq_event_percpu+0x20/0x50
    [ 0.948179] handle_irq_event+0x34/0x60
    [ 0.948181] handle_edge_irq+0x77/0x190
    [ 0.948185] handle_irq+0xaf/0x120
    [ 0.948188] do_IRQ+0x53/0x110
    [ 0.948191] common_interrupt+0x87/0x87
    [ 0.948192]
    ....
    [ 0.311136] CPU: 4 PID: 9737 Comm: run_linux_amd64 Not tainted 4.14.51-62.38.amzn1.x86_64 #1
    [ 0.311137] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017
    [ 0.311138] task: ffff883f6e6a8000 task.stack: ffffc9000f1ec000
    [ 0.311141] RIP: 0010:native_queued_spin_lock_slowpath+0xf5/0x1a0
    [ 0.311142] RSP: 0018:ffffc9000f1efa28 EFLAGS: 00000046
    [ 0.311144] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7f722a00
    [ 0.311145] RDX: 0000000000000035 RSI: 0000000000d80001 RDI: ffff887f7709ca68
    [ 0.311146] RBP: 0000000000000202 R08: 0000000000140000 R09: 0000000000000000
    [ 0.311147] R10: ffffc9000f1ef9d8 R11: 000000001a249fa0 R12: ffff887f7709ca68
    [ 0.311148] R13: ffffc9000f1efad0 R14: 0000000000000000 R15: ffff887f7709ca00
    [ 0.311149] FS: 000000c423f30090(0000) GS:ffff883f7f700000(0000) knlGS:0000000000000000
    [ 0.311150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 0.311151] CR2: 00007feefcea4000 CR3: 0000007f7016e001 CR4: 00000000003606e0
    [ 0.311152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 0.311153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 0.311154] Call Trace:
    [ 0.311157] do_raw_spin_lock+0xad/0xc0
    [ 0.311160] _raw_spin_lock_irqsave+0x44/0x4b
    [ 0.311162] ? prepare_to_wait_exclusive+0x28/0xb0
    [ 0.311164] prepare_to_wait_exclusive+0x28/0xb0
    [ 0.311167] wbt_wait+0x127/0x330
    [ 0.311169] ? finish_wait+0x80/0x80
    [ 0.311172] ? generic_make_request+0xda/0x3b0
    [ 0.311174] blk_mq_make_request+0xd6/0x7b0
    [ 0.311176] ? blk_queue_enter+0x24/0x260
    [ 0.311178] ? generic_make_request+0xda/0x3b0
    [ 0.311181] generic_make_request+0x10c/0x3b0
    [ 0.311183] ? submit_bio+0x5c/0x110
    [ 0.311185] submit_bio+0x5c/0x110
    [ 0.311197] ? __ext4_journal_stop+0x36/0xa0 [ext4]
    [ 0.311210] ext4_io_submit+0x48/0x60 [ext4]
    [ 0.311222] ext4_writepages+0x810/0x11f0 [ext4]
    [ 0.311229] ? do_writepages+0x3c/0xd0
    [ 0.311239] ? ext4_mark_inode_dirty+0x260/0x260 [ext4]
    [ 0.311240] do_writepages+0x3c/0xd0
    [ 0.311243] ? _raw_spin_unlock+0x24/0x30
    [ 0.311245] ? wbc_attach_and_unlock_inode+0x165/0x280
    [ 0.311248] ? __filemap_fdatawrite_range+0xa3/0xe0
    [ 0.311250] __filemap_fdatawrite_range+0xa3/0xe0
    [ 0.311253] file_write_and_wait_range+0x34/0x90
    [ 0.311264] ext4_sync_file+0x151/0x500 [ext4]
    [ 0.311267] do_fsync+0x38/0x60
    [ 0.311270] SyS_fsync+0xc/0x10
    [ 0.311272] do_syscall_64+0x6f/0x170
    [ 0.311274] entry_SYSCALL_64_after_hwframe+0x42/0xb7

    In the original patch, wbt_done wakes up all the exclusive
    processes in the wait queue, which can cause a thundering herd
    if there is a large number of writer threads in the queue. The
    original intention of the code seems to be to wake up only one
    thread; however, it uses wake_up_all() in __wbt_done(), and then
    uses the following check in __wbt_wait to have only one thread
    actually get out of the wait loop:

    if (waitqueue_active(&rqw->wait) &&
        rqw->wait.head.next != &wait->entry)
            return false;

    The problem with this is that the wait entry in wbt_wait is
    defined with DEFINE_WAIT, which uses the autoremove wakeup function.
    That means that the above check is invalid - the wait entry will
    have been removed from the queue already by the time we hit the
    check in the loop.

    Secondly, auto-removing the wait entries also means that the wait
    queue essentially gets reordered "randomly" (e.g. threads re-add
    themselves in the order they got to run after being woken up).
    Additionally, new requests entering wbt_wait might overtake requests
    that were queued earlier, because the wait queue will be
    (temporarily) empty after the wake_up_all, so the waitqueue_active
    check will not stop them. This can cause certain threads to starve
    under high load.

    The fix is to leave the woken up requests in the queue and remove
    them in finish_wait() once the current thread breaks out of the
    wait loop in __wbt_wait. This will ensure new requests always
    end up at the back of the queue, and they won't overtake requests
    that are already in the wait queue. With that change, the loop
    in wbt_wait is also in line with many other wait loops in the kernel.
    Waking up just one thread drastically reduces lock contention, as
    does moving the wait queue add/remove out of the loop.

    A significant drop in lockdep's lock contention numbers is seen when
    running the test application on the patched kernel.

    Signed-off-by: Anchal Agarwal
    Signed-off-by: Frank van der Linden
    Signed-off-by: Jens Axboe

    Anchal Agarwal
     

09 Jul, 2018

2 commits


09 May, 2018

6 commits

  • struct blk_issue_stat squashes three things into one u64:

    - The time the driver started working on a request
    - The original size of the request (for the io.low controller)
    - Flags for writeback throttling

    It turns out that on x86_64, we have a 4-byte hole in struct request
    which we can fill with the non-timestamp fields from blk_issue_stat,
    simplifying things quite a bit (an illustrative bit-packing sketch
    follows this entry).

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
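
    An illustrative sketch of the kind of squashing described above; the
    field widths and helper names are made up for the example and are not
    the kernel's actual blk_issue_stat layout:

    #include <stdint.h>
    #include <stdio.h>

    /* Made-up layout: 4 bits of flags, 12 bits of size class, and the
     * remaining 48 bits for a timestamp, all packed into one u64. */
    #define STAT_FLAGS_BITS  4
    #define STAT_SIZE_BITS   12
    #define STAT_TIME_SHIFT  (STAT_FLAGS_BITS + STAT_SIZE_BITS)

    static uint64_t pack(uint64_t time_ns, uint32_t size_class, uint32_t flags)
    {
            return (time_ns << STAT_TIME_SHIFT) |
                   ((uint64_t)(size_class & 0xfff) << STAT_FLAGS_BITS) |
                   (flags & 0xf);
    }

    static uint64_t unpack_time(uint64_t s)  { return s >> STAT_TIME_SHIFT; }
    static uint32_t unpack_size(uint64_t s)  { return (s >> STAT_FLAGS_BITS) & 0xfff; }
    static uint32_t unpack_flags(uint64_t s) { return s & 0xf; }

    int main(void)
    {
            uint64_t stat = pack(123456789ULL, 4, 0x3);

            printf("time=%llu size_class=%u flags=%#x\n",
                   (unsigned long long)unpack_time(stat),
                   unpack_size(stat), unpack_flags(stat));
            return 0;
    }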
     
  • issue_stat is going to go away, so first make writeback throttling take
    the containing request, update the internal wbt helpers accordingly, and
    change rwb->sync_cookie to be the request pointer instead of the
    issue_stat pointer. No functional change.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • A few helpers are only used from blk-wbt.c, so move them there, and put
    wbt_track() behind the CONFIG_BLK_WBT ifdef. This is in preparation
    for changing how the wbt flags are tracked.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Throttle discards like we would any background write. Discards should
    be background activity, so if they are impacting foreground IO, then
    we will throttle them down.

    Reviewed-by: Darrick J. Wong
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This is in preparation for having more write queues, in which
    case we would need to pass in more information than just
    a simple 'is_kswapd' boolean.

    Reviewed-by: Darrick J. Wong
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • We currently special case WRITE and FLUSH, but we should really
    just include any command with the write bit set. This ensures
    that we account DISCARD.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     

07 Feb, 2018

1 commit

  • Mikulas reported a workload that saw bad performance, and figured
    out that it was due to various other types of requests being
    accounted as reads. Flush requests, for instance. Due to the
    high latency of those, we heavily throttle the writes to keep
    the latencies in balance. But they really should be accounted
    as writes.

    Fix this by checking the exact type of the request. If it's a
    read, account it as a read; if it's a write or a flush, account
    it as a write. Any other request we disregard (see the sketch
    after this entry). Previously everything would have been
    mistakenly accounted as reads.

    Reported-by: Mikulas Patocka
    Cc: stable@vger.kernel.org # v4.12+
    Signed-off-by: Jens Axboe

    Jens Axboe
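
    A small userspace sketch of the classification rule described above;
    the enums are illustrative stand-ins, not the kernel's
    request-operation definitions:

    #include <stdio.h>

    enum op { OP_READ, OP_WRITE, OP_FLUSH, OP_DISCARD };
    enum account { ACCT_READ, ACCT_WRITE, ACCT_NONE };

    /* Account flushes with the writes they amplify; ignore anything else
     * rather than mistakenly lumping it in with reads. */
    static enum account wbt_account(enum op op)
    {
            switch (op) {
            case OP_READ:
                    return ACCT_READ;
            case OP_WRITE:
            case OP_FLUSH:
                    return ACCT_WRITE;
            default:
                    return ACCT_NONE;
            }
    }

    int main(void)
    {
            printf("flush accounted as %s\n",
                   wbt_account(OP_FLUSH) == ACCT_WRITE ? "write" : "other");
            return 0;
    }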
     

24 Nov, 2017

3 commits


15 Nov, 2017

1 commit

  • Pull core block layer updates from Jens Axboe:
    "This is the main pull request for block storage for 4.15-rc1.

    Nothing out of the ordinary in here, and no API changes or anything
    like that. Just various new features for drivers, core changes, etc.
    In particular, this pull request contains:

    - A patch series from Bart, closing the hole on blk/scsi-mq queue
    quiescing.

    - A series from Christoph, building towards hidden gendisks (for
    multipath) and ability to move bio chains around.

    - NVMe
    - Support for native multipath for NVMe (Christoph).
    - Userspace notifications for AENs (Keith).
    - Command side-effects support (Keith).
    - SGL support (Chaitanya Kulkarni)
    - FC fixes and improvements (James Smart)
    - Lots of fixes and tweaks (Various)

    - bcache
    - New maintainer (Michael Lyle)
    - Writeback control improvements (Michael)
    - Various fixes (Coly, Elena, Eric, Liang, et al)

    - lightnvm updates, mostly centered around the pblk interface
    (Javier, Hans, and Rakesh).

    - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

    - Writeback series that fix the much discussed hundreds of millions
    of sync-all units. This goes all the way, as discussed previously
    (me).

    - Fix for missing wakeup on writeback timer adjustments (Yafang
    Shao).

    - Fix laptop mode on blk-mq (me).

    - {mq,name} tuple lookup for IO schedulers, allowing us to have
    alias names. This means you can use 'deadline' on both !mq and on
    mq (where it's called mq-deadline). (me).

    - blktrace race fix, oopsing on sg load (me).

    - blk-mq optimizations (me).

    - Obscure waitqueue race fix for kyber (Omar).

    - NBD fixes (Josef).

    - Disable writeback throttling by default on bfq, like we do on cfq
    (Luca Miccio).

    - Series from Ming that enables us to treat flush requests on blk-mq
    like any other request. This is a really nice cleanup.

    - Series from Ming that improves merging on blk-mq with schedulers,
    getting us closer to flipping the switch on scsi-mq again.

    - BFQ updates (Paolo).

    - blk-mq atomic flags memory ordering fixes (Peter Z).

    - Loop cgroup support (Shaohua).

    - Lots of minor fixes from lots of different folks, both for core and
    driver code"

    * 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
    nvme: fix visibility of "uuid" ns attribute
    blk-mq: fixup some comment typos and lengths
    ide: ide-atapi: fix compile error with defining macro DEBUG
    blk-mq: improve tag waiting setup for non-shared tags
    brd: remove unused brd_mutex
    blk-mq: only run the hardware queue if IO is pending
    block: avoid null pointer dereference on null disk
    fs: guard_bio_eod() needs to consider partitions
    xtensa/simdisk: fix compile error
    nvme: expose subsys attribute to sysfs
    nvme: create 'slaves' and 'holders' entries for hidden controllers
    block: create 'slaves' and 'holders' entries for hidden gendisks
    nvme: also expose the namespace identification sysfs files for mpath nodes
    nvme: implement multipath access to nvme subsystems
    nvme: track shared namespaces
    nvme: introduce a nvme_ns_ids structure
    nvme: track subsystems
    block, nvme: Introduce blk_mq_req_flags_t
    block, scsi: Make SCSI quiesce and resume work reliably
    block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
    ...

    Linus Torvalds