23 Mar, 2011
3 commits
-
Lina reported that if throttle limits are initially very high and then
dropped, no new bio might be dispatched for a long time. The reason is
that after dropping the limits we don't reset the existing slice, so the
rate calculation is done with the new low rate while still accounting
the bios dispatched at the high rate. To fix it, reset the slice upon
rate change.
https://lkml.org/lkml/2011/3/10/298
Another problem with a very high limit is that we never queued the
bio on the throtl service tree. That means we kept on extending the
group slice but never trimmed it. Fix that as well by regularly
trimming the slice even if no bio is being queued up.
Reported-by: Lina Lu
Signed-off-by: Vivek Goyal
Signed-off-by: Jens Axboe
-
This change moves unaccounted_time to only be reported when
CONFIG_DEBUG_BLK_CGROUP is true.
Signed-off-by: Justin TerAvest
Signed-off-by: Jens Axboe
-
Commit "Add unaccounted time to timeslice_used" changed the behavior of
cfq_preempt_queue to set cfqq active. Vivek pointed out that other
preemption rules might get involved, so we shouldn't manually set which
queue is active.
This cleans up the code to just clear the queue stats at preemption
time.
Signed-off-by: Justin TerAvest
Signed-off-by: Jens Axboe
22 Mar, 2011
1 commit
-
After the stack plugging introduction, these are called lockless.
Ensure that the counters are updated atomically.
Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe
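The lockless-counter problem above can be sketched in userspace with C11 atomics standing in for the kernel's atomic helpers (structure and field names are illustrative):

```c
/* Illustrative sketch, not the kernel code: once a counter can be
 * updated without the queue lock held, a plain "st->ios++" is a data
 * race. C11 atomics model the kernel's atomic operations here. */
#include <stdatomic.h>

struct disk_stats {
    atomic_ulong ios;      /* completed requests */
    atomic_ulong sectors;  /* sectors transferred */
};

static void stat_account(struct disk_stats *st, unsigned long sectors)
{
    /* Lockless callers must use atomic read-modify-write. */
    atomic_fetch_add(&st->ios, 1);
    atomic_fetch_add(&st->sectors, sectors);
}

static unsigned long stat_read_ios(struct disk_stats *st)
{
    return atomic_load(&st->ios);
}
```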
21 Mar, 2011
1 commit
-
One of the disadvantages of on-stack plugging is that we potentially
lose out on merging since all pending IO isn't always visible to
everybody. When we flush the on-stack plugs, right now we don't do
any checks to see if potential merge candidates could be utilized.
Correct this by adding a new insert variant, ELEVATOR_INSERT_SORT_MERGE.
It works just like ELEVATOR_INSERT_SORT, but first checks whether we can
merge with an existing request before doing the insertion (falling back
to a plain insertion if we fail merging).
This fixes a regression with multiple processes issuing IO that
can be merged.
Thanks to Shaohua Li for testing and fixing
an accounting bug.
Signed-off-by: Jens Axboe
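A simplified model of the merge-before-insert idea behind ELEVATOR_INSERT_SORT_MERGE (the structures and the head-insert fallback are illustrative, not the elevator code):

```c
/* Before inserting a flushed request into the queue, try to back-merge
 * it with an existing contiguous request; only insert if that fails. */
#include <stdbool.h>
#include <stddef.h>

struct request {
    unsigned long sector;   /* start sector */
    unsigned long nr;       /* length in sectors */
    struct request *next;
};

/* Back-merge: does 'rq' continue directly after an existing request? */
static bool try_merge(struct request *q, struct request *rq)
{
    for (struct request *r = q; r; r = r->next) {
        if (r->sector + r->nr == rq->sector) {
            r->nr += rq->nr;   /* merged: grow the existing request */
            return true;
        }
    }
    return false;
}

/* INSERT_SORT_MERGE: merge if possible, otherwise fall back to a plain
 * insertion (sorted insert simplified to a head insert here). */
static void insert_sort_merge(struct request **q, struct request *rq)
{
    if (*q && try_merge(*q, rq))
        return;
    rq->next = *q;
    *q = rq;
}
```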
19 Mar, 2011
1 commit
-
"disk" is always NULL when we goto out. There was a check for this
before, but it was removed in 69e02c59a7d9 "block: Don't check events
while open is in progress".
Signed-off-by: Dan Carpenter
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe
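The error-path shape behind the fix above can be sketched like this (a hypothetical, simplified function; names and the refcount field are illustrative, not the actual block-layer code):

```c
/* 'disk' is only assigned after a lookup that can be bypassed via
 * 'goto out', so the cleanup at 'out' must tolerate disk == NULL -
 * the check this commit restores. */
#include <stddef.h>

struct disk { int refcount; };

static void put_disk_ref(struct disk *disk)
{
    if (disk)              /* the restored NULL check */
        disk->refcount--;
}

static int open_device(struct disk *looked_up, int lookup_fails)
{
    struct disk *disk = NULL;

    if (lookup_fails)
        goto out;          /* disk is still NULL here */
    disk = looked_up;
    disk->refcount++;
    return 0;
out:
    put_disk_ref(disk);    /* safe even when disk == NULL */
    return -1;
}
```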
17 Mar, 2011
7 commits
-
Version 3 is updated to apply to for-2.6.39/core.
For version 2, I took Vivek's advice and made sure we update the group
weight from cfq_group_service_tree_add().
If a weight was updated while a group is on the service tree, the
calculation for the total weight of the service tree can be adjusted
improperly, which either leads to bad service tree weights, or
potentially crashes (if total_weight becomes 0).
This patch defers updates to the weight until a group is off the service
tree.
Signed-off-by: Justin TerAvest
Acked-by: Vivek Goyal
Signed-off-by: Jens Axboe
-
We don't have proper reference counting for this yet, so we run into
cases where the device is pulled and we OOPS on flushing the fs data.
This happens even though the dirty inodes have already been
migrated to the default_backing_dev_info.
Reported-by: Torsten Hilbrich
Tested-by: Torsten Hilbrich
Cc: stable@kernel.org
Signed-off-by: Jens Axboe
-
MD and DM create a new bio_set for every metadevice. Each bio_set has an
integrity mempool attached regardless of whether the metadevice is
capable of passing integrity metadata. This is a waste of memory.
Instead we defer the allocation decision to MD and DM since we know at
metadevice creation time whether integrity passthrough is needed or not.
Automatic integrity mempool allocation can then be removed from
bioset_create() and we make an explicit integrity allocation for the
fs_bio_set.
Signed-off-by: Martin K. Petersen
Reported-by: Zdenek Kabelac
Acked-by: Mike Snitzer
Signed-off-by: Jens Axboe
-
'write_op' was still used, even though it was always WRITE_SYNC now.
Add plugging around the cases where it submits IO, and flush them
before we end up waiting for that IO.
Signed-off-by: Jens Axboe
-
'write_op' was still used, even though it was always WRITE_SYNC now.
Add plugging around the cases where it submits IO, and flush them
before we end up waiting for that IO.
Signed-off-by: Jens Axboe
-
It used WRITE_SYNC_PLUG before and potentially submits a batch
of IO, so let's enable plugging for this case.
Signed-off-by: Jens Axboe
-
This recovers a performance regression caused by the removal
of the per-device plugging.
Signed-off-by: Jens Axboe
12 Mar, 2011
4 commits
-
There are two kinds of times that tasks are not charged for: the first
seek and the extra time slice used over the allocated timeslice. Both
of these are exported as a new unaccounted_time stat.
I think it would be good to have this reported in 'time' as well, but
that is probably a separate discussion.
Signed-off-by: Justin TerAvest
Signed-off-by: Jens Axboe
-
They used an older prototype, fix it up.
Reported-by: Randy Dunlap
Signed-off-by: Jens Axboe
-
Barrier support has already been removed, so remove the obsolete
comments in blkdev_issue_zeroout.
Cc: Jens Axboe
Signed-off-by: Tao Ma
Signed-off-by: Jens Axboe
-
In blk_add_trace_rq, we only chose the minor 2 bits from the
request's cmd_flags and did some checking for discard, so most of
the other flags (e.g. REQ_SYNC) are missing.
For example, with a sync write, after blkparse we get:
8,16 1 1 0.001776503 7509 A WS 1349632 + 1024
So pass the request's cmd_flags directly to __blk_add_trace.
With this patch, after a sync write we get:
8,16 1 1 0.001776900 5425 A WS 1189888 + 1024
Acked-by: Jeff Moyer
Signed-off-by: Jens Axboe
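The effect of passing the full cmd_flags can be modeled in a few lines (flag bit values and the fill function are illustrative, not blktrace's actual code):

```c
/* Build the blkparse-style action string from the full cmd_flags
 * instead of only the low bits, so the sync bit shows up as "WS"
 * rather than being dropped. Bit values here are illustrative. */
#include <string.h>

#define REQ_WRITE  (1u << 0)
#define REQ_SYNC   (1u << 1)

static void fill_rwbs(char *rwbs, unsigned int cmd_flags)
{
    int i = 0;
    rwbs[i++] = (cmd_flags & REQ_WRITE) ? 'W' : 'R';
    if (cmd_flags & REQ_SYNC)
        rwbs[i++] = 'S';   /* lost when only the low bits were passed */
    rwbs[i] = '\0';
}
```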
10 Mar, 2011
23 commits
-
Conflicts:
block/blk-core.c
block/blk-flush.c
drivers/md/raid1.c
drivers/md/raid10.c
drivers/md/raid5.c
fs/nilfs2/btnode.c
fs/nilfs2/mdt.c
Signed-off-by: Jens Axboe
-
Use plug in throttle dispatch also as we are dispatching a bunch of
bios in throttle context and some of them might merge.
Signed-off-by: Vivek Goyal
Signed-off-by: Jens Axboe
-
With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.
Signed-off-by: Jens Axboe
-
This should be useless now that we have on-stack plugging, so let's just
kill it.
Signed-off-by: Jens Axboe
-
Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe
-
Signed-off-by: Jens Axboe
-
Signed-off-by: Jens Axboe
-
Signed-off-by: Jens Axboe
-
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe
-
This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.
The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.
The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.
Signed-off-by: Jens Axboe
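A simplified userspace model of the explicit on-stack plugging described above (the structure and function names mirror the description, not the kernel source):

```c
/* A plug lives on the caller's stack, batches submitted requests, and
 * flushes them all at once when finished - instead of implicit
 * plugging plus a remember-to-unplug contract. */
#include <stddef.h>

struct request { struct request *next; };

struct blk_plug {
    struct request *list;   /* requests batched while plugged */
    int depth;
};

static void blk_start_plug_model(struct blk_plug *plug)
{
    plug->list = NULL;
    plug->depth = 0;
}

/* Queue a request into the on-stack plug instead of the device queue. */
static void submit_while_plugged(struct blk_plug *plug, struct request *rq)
{
    rq->next = plug->list;
    plug->list = rq;
    plug->depth++;
}

/* Flush the batch; returns how many requests were dispatched. The real
 * flush is also where merge checking against pending IO can happen. */
static int blk_finish_plug_model(struct blk_plug *plug)
{
    int n = plug->depth;
    plug->list = NULL;
    plug->depth = 0;
    return n;
}
```

Because the plug is on the submitter's stack and tracked in the task structure, the scheduler can auto-flush it on a schedule event, which is what removes the old "forgot to unplug" class of bugs.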
-
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit. A
default delay of 3 msec is defined, to match the previous
behaviour.
Signed-off-by: Jens Axboe
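The delay-API idea replacing plugging-as-delay can be sketched as follows (a hypothetical model; the structure, names, and time handling are illustrative):

```c
/* Instead of plugging and later unplugging to postpone queue work, the
 * queue is simply asked to run again after a short delay - 3 msec by
 * default, matching the commit above. */
struct run_queue {
    unsigned long run_at_ms;   /* earliest time the queue may run */
};

#define DEFAULT_DELAY_MS 3

static void delay_queue_run(struct run_queue *q, unsigned long now_ms,
                            unsigned long delay_ms)
{
    q->run_at_ms = now_ms + delay_ms;
}

static int queue_may_run(const struct run_queue *q, unsigned long now_ms)
{
    return now_ms >= q->run_at_ms;
}
```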
-
It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit.
Signed-off-by: Jens Axboe
Acked-by: David S. Miller
-
Currently we use plugging for that, but as plugging is going away,
we need an alternative mechanism.
Signed-off-by: Jens Axboe
-
Convert two staging drivers - blkvsc_drv and cyasblkdev_block - from
->media_changed() to ->check_events(). The former always indicated
media changed while the latter always indicated media not changed.
Not sure what the drivers are trying to achieve but keep the original
behavior.
Signed-off-by: Tejun Heo
Acked-by: Greg Kroah-Hartman
Cc: Jens Axboe
Cc: Kay Sievers
-
Convert from ->media_changed() to ->check_events().
pktcdvd needs to forward all event related operations to the
underlying device. Forward ->check_events() instead of
->media_changed() and inherit disk->[async_]events.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Peter Osterlund
-
umem doesn't implement media changed detection and there's no need to
implement a dummy callback anymore. Remove it.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
-
Convert from ->media_changed() to ->check_events().
s390/tape_block buffers media changed state and clears it on
revalidation. It will behave correctly with kernel event polling.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Martin Schwidefsky
Cc: Heiko Carstens
-
Convert from ->media_changed() to ->check_events().
i2o_block buffers media changed state and clears it after reporting.
It will behave correctly with kernel event polling.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Markus Lidel
-
Convert from ->media_changed() to ->check_events().
xsysace buffers media changed state and clears it on revalidation. It
will behave correctly with kernel event polling.
Signed-off-by: Tejun Heo
Acked-by: Grant Likely
Cc: Jens Axboe
Cc: Kay Sievers
-
Convert from ->media_changed() to ->check_events().
ub buffers media changed state and clears it on revalidation. It will
behave correctly with kernel event polling.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Pete Zaitcev
-
Convert from ->media_changed() to ->check_events().
Both swim and swim3 buffer media changed state and clear it on
revalidation. They will behave correctly with kernel event polling.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Laurent Vivier
Cc: Benjamin Herrenschmidt
-
Convert from ->media_changed() to ->check_events().
DAC960 media change notification seems to be one way (once set, never
cleared) and will generate spurious events when polled once the
condition triggers.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
-
Convert paride drivers from ->media_changed() to ->check_events().
pcd and pd buffer and clear events after reporting; however, pf
unconditionally reports MEDIA_CHANGE and will generate spurious events
when polled.
Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Kay Sievers
Cc: Tim Waugh
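The ->media_changed() to ->check_events() conversions above all follow one pattern: the driver buffers a media-changed flag and clears it when the event is reported, so kernel event polling does not see the same event twice. A hedged sketch of that pattern (hypothetical driver, simplified signature, illustrative names):

```c
/* New-style callback: report and clear pending events in one step,
 * unlike ->media_changed(), which only returned a boolean and left
 * clearing semantics up to each driver. */
#define DISK_EVENT_MEDIA_CHANGE (1u << 0)

struct toy_drive {
    int media_changed;   /* set by the (hypothetical) interrupt path */
};

static unsigned int toy_check_events(struct toy_drive *drv,
                                     unsigned int clearing)
{
    unsigned int pending = 0;

    if (drv->media_changed)
        pending |= DISK_EVENT_MEDIA_CHANGE;
    if (clearing & DISK_EVENT_MEDIA_CHANGE)
        drv->media_changed = 0;
    return pending;
}
```

Drivers that never clear the flag (DAC960, pf above) generate spurious events under polling, which is exactly the caveat those commit messages note.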