21 Jul, 2016

1 commit

  • Before merging a bio into an existing request, the io scheduler is
    called to give its approval first. However, requests that come from a
    plug flush may get merged by the block layer without consulting the io
    scheduler.

    In case of CFQ, this can cause fairness problems. For instance, if a
    request gets merged into a low weight cgroup's request, the high weight
    cgroup now depends on the low weight cgroup to get scheduled. If the
    high weight cgroup needs that io request to complete before submitting
    more requests, it will also lose its timeslice.

    The following script demonstrates the problem. Group g1 has a low
    weight, g2 and g3 have equal high weights, but g2's requests are
    adjacent to g1's requests so they are subject to merging. Due to these
    merges, g2 gets poor disk time allocation.

    cat > cfq-merge-repro.sh << "EOF"
    #!/bin/bash
    set -e

    IO_ROOT=/mnt-cgroup/io

    mkdir -p $IO_ROOT

    if ! mount | grep -qw $IO_ROOT; then
        mount -t cgroup none -oblkio $IO_ROOT
    fi

    cd $IO_ROOT

    for i in g1 g2 g3; do
        if [ -d $i ]; then
            rmdir $i
        fi
    done

    mkdir g1 && echo 10 > g1/blkio.weight
    mkdir g2 && echo 495 > g2/blkio.weight
    mkdir g3 && echo 495 > g3/blkio.weight

    RUNTIME=10

    (echo $BASHPID > g1/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=0k &> /dev/null)&

    (echo $BASHPID > g2/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=64k &> /dev/null)&

    (echo $BASHPID > g3/cgroup.procs &&
     fio --readonly --name name1 --filename /dev/sdb \
         --rw read --size 64k --bs 64k --time_based \
         --runtime=$RUNTIME --offset=256k &> /dev/null)&

    sleep $((RUNTIME+1))

    for i in g1 g2 g3; do
        echo ---- $i ----
        cat $i/blkio.time
    done

    EOF
    # ./cfq-merge-repro.sh
    ---- g1 ----
    8:16 162
    ---- g2 ----
    8:16 165
    ---- g3 ----
    8:16 686

    After applying the patch:

    # ./cfq-merge-repro.sh
    ---- g1 ----
    8:16 90
    ---- g2 ----
    8:16 445
    ---- g3 ----
    8:16 471
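
    The commit text above doesn't include the code change itself; as a
    minimal sketch of the idea (the hook and field names here are
    assumptions, not quoted from the patch), the block layer would ask the
    scheduler before letting two requests merge during a plug flush:

    static bool blk_allow_rq_merge(struct request_queue *q, struct request *rq,
                                   struct request *next)
    {
            struct elevator_queue *e = q->elevator;

            /* consult the io scheduler before merging plug-flush requests */
            if (e->type->ops.elevator_allow_rq_merge_fn)
                    return e->type->ops.elevator_allow_rq_merge_fn(q, rq, next);

            return true;
    }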

    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Jens Axboe

    Tahsin Erdogan
     

28 Jun, 2016

1 commit

  • Currently rq->fifo_time is unsigned long, but CFQ stores a nanosecond
    timestamp in it, which would overflow on 32-bit archs. Convert it to u64
    to avoid the overflow. Since rq->fifo_time is unioned with struct
    call_single_data, this does not change the size of struct request in
    any way.

    We have to slightly fix up block/deadline-iosched.c so that the
    comparison happens in the right types.
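
    An illustrative sketch of that fixup (the exact lines are an assumption,
    not quoted from the patch): deadline keeps a jiffies value in the now-u64
    fifo_time, so the expiry check casts back to unsigned long before using
    the jiffies helpers:

    /* sketch of the deadline expiry check after the type change */
    if (time_after_eq(jiffies, (unsigned long)rq->fifo_time))
            return 1;       /* request has waited past its deadline */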

    Fixes: 9a7f38c42c2b92391d9dabaf9f51df7cfe5608e4
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     

02 Feb, 2016

1 commit


25 Feb, 2014

1 commit

  • The block layer currently abuses rq->csd.list.next for storing
    fifo_time. That is a terrible hack and completely unnecessary as well.
    A union achieves the same space saving in a cleaner way.
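
    Roughly, the resulting layout inside struct request is a plain union, so
    the field costs no extra space whichever way it is used (a sketch, not
    the verbatim diff):

    union {
            struct call_single_data csd;    /* IPI completion data */
            unsigned long fifo_time;        /* io scheduler fifo expiry */
    };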

    Signed-off-by: Jan Kara
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Jan Kara
     

12 Sep, 2013

1 commit


03 Jul, 2013

1 commit

  • There's a race between elevator switching and normal io operation,
    because the allocation of struct elevator_queue and struct elevator_data
    is not done as one atomic operation. That leaves a window in which a
    NULL ->elevator_data can be used. For example:

        Thread A                        Thread B
        blk_queue_bio                   elevator_switch
        spin_lock_irq(q->queue_lock)    elevator_alloc
        elv_merge                       elevator_init_fn

    elevator_alloc cannot be called with queue_lock held, so while it runs
    ->elevator_data is still NULL. If thread A calls elv_merge at that
    moment, it needs data from ->elevator_data, and the crash happens.

    Move elevator_alloc into the elevator_init_fn so that the two operations
    are done atomically with respect to the queue; see the sketch after the
    reproducer below.

    This bug is easily reproduced with the following method:
    1: dd if=/dev/sdb of=/dev/null
    2: while true; do echo noop > scheduler; echo deadline > scheduler; done

    The fix was tested with the same method.
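
    A sketch of the fixed init path, modeled on the noop scheduler (details
    are approximate, not the literal patch): the init function allocates the
    elevator_queue itself and only publishes q->elevator once ->elevator_data
    is set, under the queue lock:

    static int noop_init_queue(struct request_queue *q, struct elevator_type *e)
    {
            struct elevator_queue *eq;
            struct noop_data *nd;

            eq = elevator_alloc(q, e);
            if (!eq)
                    return -ENOMEM;

            nd = kmalloc_node(sizeof(*nd), GFP_KERNEL, q->node);
            if (!nd) {
                    kobject_put(&eq->kobj);
                    return -ENOMEM;
            }
            eq->elevator_data = nd;
            INIT_LIST_HEAD(&nd->queue);

            spin_lock_irq(q->queue_lock);
            q->elevator = eq;
            spin_unlock_irq(q->queue_lock);
            return 0;
    }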

    Signed-off-by: Jianpeng Ma
    Signed-off-by: Jens Axboe

    Jianpeng Ma
     

24 Mar, 2013

1 commit

  • Just a little convenience macro - the main reason to add it now is to
    prepare for immutable bio vecs; it'll reduce the size of the patch that
    puts bi_sector/bi_size/bi_idx into a struct bvec_iter.
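
    The log doesn't name the macro here; assuming it is the bio end-sector
    helper, a plausible sketch of such a convenience macro (with the
    pre-bvec_iter field names) is:

    /* assumed sketch: first sector past the end of a bio */
    #define bio_end_sector(bio)     ((bio)->bi_sector + bio_sectors(bio))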

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Lars Ellenberg
    CC: Jiri Kosina
    CC: Alasdair Kergon
    CC: dm-devel@redhat.com
    CC: Neil Brown
    CC: Martin Schwidefsky
    CC: Heiko Carstens
    CC: linux-s390@vger.kernel.org
    CC: Chris Mason
    CC: Steven Whitehouse
    Acked-by: Steven Whitehouse

    Kent Overstreet
     

10 Dec, 2012

1 commit


07 Mar, 2012

1 commit

  • elevator_ops->elevator_init_fn() has a weird return value. It returns
    a void * which the caller should assign to q->elevator->elevator_data,
    and a %NULL return denotes init failure.

    Update it so that it returns integer 0/-errno and sets elevator_data
    directly as necessary.

    This makes the interface more conventional and eases further cleanup.
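
    A sketch of the signature change (illustrative; the surrounding code is
    omitted):

    /* before: return the private data, NULL on failure */
    void *(*elevator_init_fn)(struct request_queue *);

    /* after: return 0/-errno and set q->elevator->elevator_data directly */
    int (*elevator_init_fn)(struct request_queue *);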

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     

14 Dec, 2011

1 commit

  • Let elevators set ->icq_size and ->icq_align in elevator_type, and have
    elv_register() and elv_unregister() create and destroy the kmem_cache
    for icq respectively.

    * elv_register() now can return failure. All callers updated.

    * icq caches are automatically named "ELVNAME_io_cq".

    * cfq_slab_setup/kill() are collapsed into cfq_init/exit().

    * While at it, minor indentation change for iosched_cfq.elevator_name
    for consistency.

    This will help moving icq management to block core. This doesn't
    introduce any functional change.
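
    A sketch of the elv_register() side (field names and buffer handling are
    approximate): when an elevator declares an icq_size, its icq cache is
    created and named after the elevator:

    if (e->icq_size) {
            snprintf(e->icq_cache_name, sizeof(e->icq_cache_name),
                     "%s_io_cq", e->elevator_name);
            e->icq_cache = kmem_cache_create(e->icq_cache_name, e->icq_size,
                                             e->icq_align, 0, NULL);
            if (!e->icq_cache)
                    return -ENOMEM;
    }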

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     

03 Jun, 2011

1 commit

  • Hi, Jens,

    If you recall, I posted an RFC patch for this back in July of last year:
    http://lkml.org/lkml/2010/7/13/279

    The basic problem is that a process can issue a never-ending stream of
    async direct I/Os to the same sector on a device, thus starving out
    other I/O in the system (due to the way the alias handling works in both
    cfq and deadline). The solution I proposed back then was to start
    dispatching from the fifo after a certain number of aliases had been
    dispatched. Vivek asked why we had to treat aliases differently at all,
    and I never had a good answer. So, I put together a simple patch which
    allows aliases to be added to the rb tree (it adds them to the right,
    though that doesn't matter as the order isn't guaranteed anyway). I
    think this is the preferred solution, as it doesn't break up time slices
    in CFQ or batches in deadline. I've tested it, and it does solve the
    starvation issue. Let me know what you think.

    Cheers,
    Jeff
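
    For reference, the insertion this allows is just a plain rb-tree insert
    where an alias (equal sector) goes to the right; a sketch along the
    lines of elv_rb_add():

    struct rb_node **p = &root->rb_node, *parent = NULL;
    struct request *__rq;

    while (*p) {
            parent = *p;
            __rq = rb_entry(parent, struct request, rb_node);

            if (blk_rq_pos(rq) < blk_rq_pos(__rq))
                    p = &(*p)->rb_left;
            else    /* equal sectors (aliases) also go right */
                    p = &(*p)->rb_right;
    }
    rb_link_node(&rq->rb_node, parent, p);
    rb_insert_color(&rq->rb_node, root);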

    Signed-off-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jeff Moyer
     

10 Mar, 2011

1 commit

  • Code has been converted over to the new explicit on-stack plugging,
    and delay users have been converted to use the new API for that.
    So let's kill off the old plugging along with aops->sync_page().

    Signed-off-by: Jens Axboe

    Jens Axboe
     

11 May, 2009

1 commit

  • With recent cleanups, there is no place where a low level driver
    directly manipulates request fields. This means that the 'hard'
    request fields always equal the !hard fields. Convert all
    rq->sectors, nr_sectors and current_nr_sectors references to
    accessors.

    While at it, drop the superfluous blk_rq_pos() < 0 test in swim.c.

    [ Impact: use pos and nr_sectors accessors ]
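
    An illustrative before/after of the conversion (not lines taken from the
    patch):

    /* before: direct field access */
    sector_t pos       = rq->sector;
    unsigned int left  = rq->nr_sectors;
    unsigned int cur   = rq->current_nr_sectors;

    /* after: accessors */
    sector_t pos       = blk_rq_pos(rq);
    unsigned int left  = blk_rq_sectors(rq);
    unsigned int cur   = blk_rq_cur_sectors(rq);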

    Signed-off-by: Tejun Heo
    Acked-by: Geert Uytterhoeven
    Tested-by: Grant Likely
    Acked-by: Grant Likely
    Tested-by: Adrian McMenamin
    Acked-by: Adrian McMenamin
    Acked-by: Mike Miller
    Cc: James Bottomley
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Eric Moore
    Cc: Alan Stern
    Cc: FUJITA Tomonori
    Cc: Pete Zaitcev
    Cc: Stephen Rothwell
    Cc: Paul Clements
    Cc: Tim Waugh
    Cc: Jeff Garzik
    Cc: Jeremy Fitzhardinge
    Cc: Alex Dubov
    Cc: David Woodhouse
    Cc: Martin Schwidefsky
    Cc: Dario Ballabio
    Cc: David S. Miller
    Cc: Rusty Russell
    Cc: unsik Kim
    Cc: Laurent Vivier
    Signed-off-by: Jens Axboe

    Tejun Heo
     

29 Dec, 2008

1 commit


09 Oct, 2008

2 commits

  • * convert goto to simpler while loop;
    * use rq_end_sector() instead of computing manually;
    * fix incorrect comments;
    * remove spurious whitespace;
    * convert the rq_rb_root macro to an inline function (sketched below).
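
    A sketch of that macro-to-inline conversion (names follow the deadline
    scheduler; details approximate):

    /* before */
    #define RQ_RB_ROOT(dd, rq)      (&(dd)->sort_list[rq_data_dir((rq))])

    /* after */
    static inline struct rb_root *
    deadline_rb_root(struct deadline_data *dd, struct request *rq)
    {
            return &dd->sort_list[rq_data_dir(rq)];
    }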

    Signed-off-by: Aaron Carroll
    Signed-off-by: Jens Axboe

    Aaron Carroll
     
  • Deadline currently only batches sector-contiguous requests, so except
    for a few circumstances (e.g. requests in a single direction), it is
    essentially first come first served. This is bad for throughput, so
    change it to CSCAN, which means requests in a batch do not need to be
    sequential and are issued in increasing sector order.
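
    Conceptually, the batch-continuation test changes from strict contiguity
    to "anything at a higher sector"; an illustrative (not literal)
    comparison:

    /* illustrative only: names and structure are assumptions */
    static bool can_extend_batch(sector_t last_end, sector_t next_pos, bool cscan)
    {
            if (cscan)
                    return next_pos >= last_end;    /* C-SCAN: any higher sector */
            return next_pos == last_end;            /* old: strictly contiguous */
    }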

    Signed-off-by: Aaron Carroll
    Signed-off-by: Jens Axboe

    Aaron Carroll
     

18 Dec, 2007

1 commit


02 Nov, 2007

3 commits

  • After switching data directions, deadline always starts the next batch
    from the lowest-sector request. This gives excessive deadline expiries
    and large latency and throughput disparity between high- and low-sector
    requests; an order of magnitude in some tests.

    This patch changes the batching behaviour so new batches start from the
    request whose expiry is earliest.
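
    In other words, when a batch starts in a new direction, dispatch now
    picks the fifo head (the request whose deadline expires first) rather
    than the lowest-sector request; roughly (the exact line is an
    assumption):

    /* start the new batch at the request that has waited the longest */
    rq = rq_entry_fifo(dd->fifo_list[data_dir].next);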

    Signed-off-by: Aaron Carroll
    Signed-off-by: Jens Axboe

    Aaron Carroll
     
  • The deadline I/O scheduler does not reset the batch count when starting
    a new batch at a higher-sectored request. This means the second and
    subsequent batch in the same data direction will never exceed a single
    request in size whenever higher-sectored requests are pending.

    This patch gives new batches in the same data direction as old ones
    their full quota of requests by resetting the batch count.

    Signed-off-by: Aaron Carroll
    Signed-off-by: Jens Axboe

    Aaron Carroll
     
  • Factor finding the next request in sector-sorted order into
    a function deadline_latter_request.
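
    The helper is small; roughly (a sketch, since the patch body isn't shown
    here):

    static inline struct request *
    deadline_latter_request(struct request *rq)
    {
            struct rb_node *node = rb_next(&rq->rb_node);

            if (node)
                    return rb_entry_rq(node);
            return NULL;
    }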

    Signed-off-by: Aaron Carroll
    Signed-off-by: Jens Axboe

    Aaron Carroll
     

24 Jul, 2007

1 commit

  • Some of the code has been gradually transitioned to using the proper
    struct request_queue, but there's lots left. So do a full sweep of
    the kernel, get rid of this typedef and replace its uses with
    the proper type.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

18 Jul, 2007

1 commit

  • kmalloc_node() and kmem_cache_alloc_node() were not available in a
    zeroing variant in the past. But with __GFP_ZERO it is now possible to
    do the zeroing while allocating.

    Use __GFP_ZERO to remove the explicit clearing of memory via memset
    wherever we can.
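
    An illustrative before/after (not lines from the patch):

    /* before: allocate, then clear by hand */
    nd = kmalloc_node(sizeof(*nd), GFP_KERNEL, q->node);
    if (nd)
            memset(nd, 0, sizeof(*nd));

    /* after: let the allocator zero the memory */
    nd = kmalloc_node(sizeof(*nd), GFP_KERNEL | __GFP_ZERO, q->node);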

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

01 Dec, 2006

1 commit


01 Oct, 2006

4 commits


01 Jul, 2006

1 commit


23 Jun, 2006

2 commits


21 Jun, 2006

1 commit

  • * git://git.infradead.org/~dwmw2/rbtree-2.6:
    [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
    Update UML kernel/physmem.c to use rb_parent() accessor macro
    [RBTREE] Update hrtimers to use rb_parent() accessor macro.
    [RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
    [RBTREE] Merge colour and parent fields of struct rb_node.
    [RBTREE] Remove dead code in rb_erase()
    [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
    [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
    [RBTREE] Update key.c to use rb_parent() accessor macro.
    [RBTREE] Update ext3 to use rb_parent() accessor macro.
    [RBTREE] Change rbtree off-tree marking in I/O schedulers.
    [RBTREE] Add accessor macros for colour and parent fields of rb_node

    Linus Torvalds
     

09 Jun, 2006

1 commit

  • There's a race between shutting down one io scheduler and firing up the
    next, in which a new io could enter and cause the io scheduler to be
    invoked with bad or NULL data.

    To fix this, we need to maintain the queue lock for a bit longer.
    Unfortunately we cannot do that, since the elevator init requires to be
    run without the lock held. This isn't easily fixable, without also
    changing the mempool API. So split the initialization into two parts,
    an alloc-init operation and an attach operation. Then we can
    preallocate the io scheduler and related structures, and run the attach
    inside the lock after we detach the old one.
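
    A sketch of the two-phase switch (the helper names here, such as
    elevator_alloc_and_init, are placeholders rather than the functions in
    the patch):

    /* phase 1: allocate and init the new scheduler, no locks held */
    new_e = elevator_alloc_and_init(q, new_type);
    if (!new_e)
            return -ENOMEM;

    /* phase 2: swap it in under the queue lock */
    spin_lock_irq(q->queue_lock);
    old_e = q->elevator;
    elevator_attach(q, new_e);
    spin_unlock_irq(q->queue_lock);

    elevator_exit(old_e);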

    This patch has survived 30 minutes of 1 second io scheduler switching
    with a very busy io load.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

21 Apr, 2006

1 commit


19 Mar, 2006

2 commits


06 Jan, 2006

1 commit

  • the patch below marks various read-only variables in block/* as const,
    so that gcc can optimize their use; e.g. gcc will now replace uses with
    the value directly and can even drop the storage for these variables
    entirely.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Jens Axboe

    Arjan van de Ven
     

19 Nov, 2005

1 commit


04 Nov, 2005

1 commit