05 Oct, 2009

1 commit

  • It was briefly introduced to allow CFQ to do delayed scheduling,
    but we ended up removing that feature again. So let's kill the
    function and export, and just switch CFQ back to the normal work
    schedule since it is now passing in a '0' delay from all call
    sites.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

04 Oct, 2009

1 commit

  • Not all users of the topology information want to use libblkid. Provide
    the topology information through bdev ioctls.

    Also clarify sector size comments for existing BLK ioctls.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
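
    As a quick illustration, here is a minimal userspace sketch of querying
    the new interface. The ioctl names (BLKPBSZGET, BLKIOMIN, BLKIOOPT and
    BLKALIGNOFF, alongside the pre-existing BLKSSZGET) are the ones mainline
    ended up with; treat the details as assumptions, not a reference.

        /* Sketch: read block device topology via the bdev ioctls. */
        #include <stdio.h>
        #include <fcntl.h>
        #include <unistd.h>
        #include <sys/ioctl.h>
        #include <linux/fs.h>

        int main(int argc, char **argv)
        {
                unsigned int lbs = 0, pbs = 0, io_min = 0, io_opt = 0;
                int align = 0, fd;

                if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0)
                        return 1;

                ioctl(fd, BLKSSZGET, &lbs);     /* logical block size */
                ioctl(fd, BLKPBSZGET, &pbs);    /* physical block size */
                ioctl(fd, BLKIOMIN, &io_min);   /* minimum I/O size */
                ioctl(fd, BLKIOOPT, &io_opt);   /* optimal I/O size */
                ioctl(fd, BLKALIGNOFF, &align); /* alignment offset */

                printf("logical=%u physical=%u io_min=%u io_opt=%u align=%d\n",
                       lbs, pbs, io_min, io_opt, align);
                close(fd);
                return 0;
        }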
     

03 Oct, 2009

1 commit


02 Oct, 2009

2 commits

  • Currently we set the bio size to the byte equivalent of the blocks to
    be trimmed when submitting the initial DISCARD ioctl. That means it
    is subject to the max_hw_sectors limitation of the HBA which is
    much lower than the size of a DISCARD request we can support.
    Add a separate max_discard_sectors tunable to limit the size for discard
    requests.

    We limit the max discard request size in bytes to 32 bits as that is the
    limit for bio->bi_size. This could be much larger if we had a way to pass
    that information through the block layer.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • prepare_discard_fn() was being called in a place where memory allocation
    was effectively impossible. This makes it inappropriate for all but
    the most trivial translations of Linux's DISCARD operation to the block
    command set. Additionally, adding a payload there makes the ownership
    of the bio's backing unclear, as it's now allocated by the device driver
    and not by the submitter as usual.

    It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
    the queue supports discard operations or not. blkdev_issue_discard now
    allocates a one-page, sector-length payload which is the right thing
    for the common ATA and SCSI implementations.

    The mtd implementation of prepare_discard_fn() is replaced with simply
    checking for the request being a discard.

    Largely based on a previous patch from Matthew Wilcox,
    which handled the prepare_discard_fn change but not yet the different
    payload allocation.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
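
    A hedged sketch of how a driver might advertise discard support after
    the two commits above; the helper and flag names follow the description
    (blk_queue_max_discard_sectors(), QUEUE_FLAG_DISCARD), and the chosen
    limit is only illustrative.

        /* Sketch: enable discard support on a driver's request queue. */
        #include <linux/kernel.h>
        #include <linux/blkdev.h>

        static void mydrv_setup_discard(struct request_queue *q)
        {
                /* Tell the block layer the device handles discards. */
                queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);

                /* Allow discards larger than max_hw_sectors; expressed in
                 * bytes the limit still has to fit the 32-bit bio->bi_size,
                 * hence at most UINT_MAX >> 9 sectors. */
                blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
        }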
     

14 Sep, 2009

2 commits

  • blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard,
    the only difference between the two is that blkdev_issue_discard needs to
    send a barrier discard request and blk_ioctl_discard a non-barrier one,
    and blk_ioctl_discard needs to wait on the request. To facilitate this,
    add a flags argument to blkdev_issue_discard to control both aspects of the
    behaviour. This will be very useful later on for making the waiting
    functionality available to other callers.

    Based on an earlier patch from Matthew Wilcox.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper
    around it. DM needs this to avoid poking at the queue_limits directly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Martin K. Petersen
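
    The split is small enough to sketch in full; roughly, the wrapper
    arrangement described above looks like this (the io_opt field name is
    taken from the earlier topology work, so consider it an assumption):

        /* Sketch: DM can set io_opt on a bare struct queue_limits, while
         * drivers that own a request queue keep using the wrapper. */
        void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt)
        {
                limits->io_opt = opt;
        }

        void blk_queue_io_opt(struct request_queue *q, unsigned int opt)
        {
                blk_limits_io_opt(&q->limits, opt);
        }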
     

11 Sep, 2009

5 commits

  • Test results here look good, and on big OLTP runs it has also been
    shown to significantly increase cycles attributed to the database and
    to provide a performance boost.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Instead of just checking whether this device uses block layer
    tagging, we can improve the detection by looking at the maximum
    queue depth it has reached. If that crosses 4, then deem it a
    queuing device.

    This is important on high IOPS devices, since plugging hurts
    the performance there (it can be as much as 10-15% of the sys
    time).

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Get rid of any functions that test for these bits and make callers
    use bio_rw_flagged() directly. Then it is at least directly apparent
    what variable and flag they check.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Failfast has different characteristics from other attributes. When
    issuing, executing and successfully completing requests, failfast
    doesn't make any difference. It only affects how a request is handled
    on failure. Allowing requests with different failfast settings to be
    merged causes normal IOs to fail prematurely, while not allowing it
    has performance penalties, as failfast is used for read aheads which
    are likely to be located near in-flight or to-be-issued normal IOs.

    This patch introduces the concept of 'mixed merge'. A request is a
    mixed merge if it is a merge of segments which require different
    handling on failure. Currently the only mixable attributes are
    failfast ones (or lack thereof).

    When a bio with different failfast settings is added to an existing
    request or requests of different failfast settings are merged, the
    merged request is marked mixed. Each bio carries failfast settings
    and the request always tracks failfast state of the first bio. When
    the request fails, blk_rq_err_bytes() can be used to determine how
    many bytes can be safely failed without crossing into an area which
    requires further retries.

    This allows request merging regardless of failfast settings while
    keeping the failure handling correct.

    This patch only implements mixed merge but doesn't enable it. The
    next one will update SCSI to make use of mixed merge.

    Signed-off-by: Tejun Heo
    Cc: Niel Lambrechts
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • bio and request use the same set of failfast bits. This patch makes
    the following changes to simplify things.

    * enumify BIO_RW* bits and reorder bits such that BIO_RW_FAILFAST_*
    bits coincide with __REQ_FAILFAST_* bits.

    * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
    but the matching is useless anyway. init_request_from_bio() is
    responsible for setting FAILFAST bits on FS requests and non-FS
    requests never use BIO_RW_AHEAD. Drop the code and comment from
    blk_rq_bio_prep().

    * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
    simplify FAILFAST flags handling in init_request_from_bio().

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
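
    With the bio and request failfast bits lined up, the FAILFAST handling
    in init_request_from_bio() reduces to a single masked OR. A hedged
    sketch (the helper name is hypothetical; REQ_FAILFAST_MASK is the mask
    this commit adds):

        #include <linux/bio.h>
        #include <linux/blkdev.h>

        /* Sketch: BIO_RW_FAILFAST_* and __REQ_FAILFAST_* now occupy the
         * same bit positions, so propagating the flags from a bio to its
         * request is one AND/OR against REQ_FAILFAST_MASK. */
        static void propagate_failfast(struct request *req, struct bio *bio)
        {
                req->cmd_flags |= bio->bi_rw & REQ_FAILFAST_MASK;
        }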
     

01 Aug, 2009

1 commit


12 Jul, 2009

1 commit

  • Move the definition of BLK_RW_ASYNC/BLK_RW_SYNC into linux/backing-dev.h
    so that it is available to all callers of set/clear_bdi_congested().

    This replaces commit 097041e576ee3a50d92dd643ee8ca65bf6a62e21 ("fuse:
    Fix build error"), which will be reverted.

    Signed-off-by: Trond Myklebust
    Acked-by: Larry Finger
    Cc: Jens Axboe
    Cc: Miklos Szeredi
    Signed-off-by: Linus Torvalds

    Trond Myklebust
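
    After the move, callers of the congestion helpers only need
    linux/backing-dev.h; a minimal hedged sketch (the helper name is
    hypothetical):

        #include <linux/backing-dev.h>

        /* Sketch: clear both congestion states on a backing device using
         * the BLK_RW_* constants now visible from backing-dev.h alone. */
        static void mydrv_uncongest(struct backing_dev_info *bdi)
        {
                clear_bdi_congested(bdi, BLK_RW_SYNC);
                clear_bdi_congested(bdi, BLK_RW_ASYNC);
        }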
     

11 Jul, 2009

2 commits

  • I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use
    the block layer mapping API (2.6.28).

    Douglas Gilbert explained SG_DXFER_TO_FROM_DEV:

    http://www.spinics.net/lists/linux-scsi/msg37135.html

    =
    The semantics of SG_DXFER_TO_FROM_DEV were:
    - copy user space buffer to kernel (LLD) buffer
    - do SCSI command which is assumed to be of the DATA_IN
    (data from device) variety. This would overwrite
    some or all of the kernel buffer
    - copy kernel (LLD) buffer back to the user space.

    The idea was to detect short reads by filling the original
    user space buffer with some marker bytes ("0xec" it would
    seem in this report). The "resid" value is a better way
    of detecting short reads but that was only added this century
    and requires co-operation from the LLD.
    =

    This patch changes the block layer mapping API to support this
    semantics. This simply adds another field to struct rq_map_data and
    enables __bio_copy_iov() to copy data from user space even with READ
    requests.

    It would be better to add a flags field and kill the null_mapped and
    the new from_user fields in struct rq_map_data, but that approach makes
    it difficult to send this patch to stable trees because the st and osst
    drivers use struct rq_map_data (they were converted to use the block
    layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer
    mapping API.

    zhou sf reported this regression and tested this patch:

    http://www.spinics.net/lists/linux-scsi/msg37128.html
    http://www.spinics.net/lists/linux-scsi/msg37168.html

    Reported-by: zhou sf
    Tested-by: zhou sf
    Cc: stable@kernel.org
    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • Commit 1faa16d22877f4839bd433547d770c676d1d964c accidentally broke
    the bdi congestion wait queue logic, causing us to wait on congestion
    for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.

    Signed-off-by: Jens Axboe

    Jens Axboe
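
    The confusion is easy to reproduce in callers of congestion_wait() as
    well; a hedged sketch of the kind of change this implies (the timeout
    value is illustrative):

        #include <linux/jiffies.h>
        #include <linux/backing-dev.h>

        static void wait_for_async_congestion(void)
        {
                /* before (wrong): waited keyed on WRITE (== 1)
                 *     congestion_wait(WRITE, HZ / 50);
                 * after: wait on the async congestion state (== 0) */
                congestion_wait(BLK_RW_ASYNC, HZ / 50);
        }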
     

01 Jul, 2009

1 commit

  • The initial patches to support this through sysfs export were broken
    and have been #if 0'ed out in every release. So let's just kill the
    code and reclaim some space in struct request_queue. If anyone later
    wants to fix up the sysfs bits, the git history can easily restore the
    removed bits.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

16 Jun, 2009

1 commit


13 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (29 commits)
    ide: re-implement ide_pci_init_one() on top of ide_pci_init_two()
    ide: unexport ide_find_dma_mode()
    ide: fix PowerMac bootup oops
    ide: skip probe if there are no devices on the port (v2)
    sl82c105: add printk() logging facility
    ide-tape: fix proc warning
    ide: add IDE_DFLAG_NIEN_QUIRK device flag
    ide: respect quirk_drives[] list on all controllers
    hpt366: enable all quirks for devices on quirk_drives[] list
    hpt366: sync quirk_drives[] list with pdc202xx_{new,old}.c
    ide: remove superfluous SELECT_MASK() call from do_rw_taskfile()
    ide: remove superfluous SELECT_MASK() call from ide_driveid_update()
    icside: remove superfluous ->maskproc method
    ide-tape: fix IDE_AFLAG_* atomic accesses
    ide-tape: change IDE_AFLAG_IGNORE_DSC non-atomically
    pdc202xx_old: kill resetproc() method
    pdc202xx_old: don't call pdc202xx_reset() on IRQ timeout
    pdc202xx_old: use ide_dma_test_irq()
    ide: preserve Host Protected Area by default (v2)
    ide-gd: implement block device ->set_capacity method (v2)
    ...

    Linus Torvalds
     

11 Jun, 2009

1 commit

  • This patch adds the following 2 interfaces for request-stacking drivers:

    - blk_rq_prep_clone(struct request *clone, struct request *orig,
                        struct bio_set *bs, gfp_t gfp_mask,
                        int (*bio_ctr)(struct bio *, struct bio *, void *),
                        void *data)
      * Clones bios in the original request to the clone request
        (bio_ctr is called for each cloned bio.)
      * Copies attributes of the original request to the clone request.
        The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not
        copied.

    - blk_rq_unprep_clone(struct request *clone)
      * Frees cloned bios from the clone request.

    Request stacking drivers (e.g. request-based dm) need to make a clone
    request for a submitted request and dispatch it to other devices.

    To allocate a request for the clone, request stacking drivers may not
    be able to use blk_get_request() because the allocation may be done
    in an irq-disabled context.
    So blk_rq_prep_clone() takes a request allocated by the caller
    as an argument.

    For each clone bio in the clone request, request stacking drivers
    should be able to set up their own completion handler.
    So blk_rq_prep_clone() takes a callback function which is called
    for each clone bio, and a pointer for private data which is passed
    to the callback.

    NOTE:
    blk_rq_prep_clone() doesn't copy any actual data of the original
    request. Pages are shared between original bios and cloned bios.
    So the caller must not complete the original request before the clone
    request.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
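
    A hedged sketch of how a request-stacking driver (request-based dm is
    the in-tree example) might use the pair; the mydrv_* names are
    illustrative, only blk_rq_prep_clone()/blk_rq_unprep_clone() come from
    this commit.

        #include <linux/bio.h>
        #include <linux/blkdev.h>

        /* Per-clone-bio completion handler installed via bio_ctr. */
        static void mydrv_clone_bio_endio(struct bio *clone, int error)
        {
                /* Driver-specific handling of the cloned bio would go here. */
        }

        static int mydrv_bio_ctr(struct bio *clone, struct bio *orig, void *data)
        {
                clone->bi_end_io  = mydrv_clone_bio_endio;
                clone->bi_private = data;       /* driver context */
                return 0;
        }

        /* 'clone' is allocated by the caller, since blk_get_request() may
         * not be usable in an irq-disabled context. */
        static int mydrv_setup_clone(struct request *clone, struct request *orig,
                                     struct bio_set *bs, void *ctx)
        {
                return blk_rq_prep_clone(clone, orig, bs, GFP_ATOMIC,
                                         mydrv_bio_ctr, ctx);
        }

    When the clone finishes, blk_rq_unprep_clone(clone) frees the cloned
    bios; because pages are shared, the original request is only completed
    after that.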
     

09 Jun, 2009

1 commit


07 Jun, 2009

1 commit

  • * Add a ->set_capacity block device method and use it in rescan_partitions()
    to attempt enabling the native capacity of the device upon detecting a
    partition which exceeds the device capacity.

    * Add a GENHD_FL_NATIVE_CAPACITY flag to limit attempts at enabling the
    native capacity during the partition scan.

    Together with the following patch implementing the ->set_capacity method in
    the ide-gd device driver, this allows automatic disabling of the Host
    Protected Area (HPA) if any partitions overlapping the HPA are detected.

    Cc: Robert Hancock
    Cc: Frans Pop
    Cc: "Andries E. Brouwer"
    Acked-by: Al Viro
    Emphatically-Acked-by: Alan Cox
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
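
    A hedged sketch of the driver side; the method signature follows the
    description above and the driver name is made up. A real implementation
    (ide-gd in the follow-up patch) would try to disable the HPA here and
    report the native capacity it obtained.

        #include <linux/module.h>
        #include <linux/fs.h>
        #include <linux/genhd.h>

        static unsigned long long mydrv_set_capacity(struct gendisk *disk,
                                                     unsigned long long capacity)
        {
                /* Placeholder: attempt to expose the native capacity and
                 * return whatever the device ends up reporting.  Here we
                 * just return the current capacity unchanged. */
                return get_capacity(disk);
        }

        static struct block_device_operations mydrv_fops = {
                .owner          = THIS_MODULE,
                .set_capacity   = mydrv_set_capacity,
        };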
     

03 Jun, 2009

1 commit

  • blk_queue_bounce_limit() is more than a wrapper around the request queue
    limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can
    be called by stacking drivers that wish to set the bounce limit
    explicitly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
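
    As a sketch, a stacking driver would use the new call to mirror a lower
    device's bounce limit onto its own queue, bypassing the DMA heuristics
    in blk_queue_bounce_limit(); the helper name here is hypothetical.

        #include <linux/blkdev.h>

        /* Sketch: propagate a lower queue's bounce_pfn when stacking. */
        static void mydrv_stack_bounce(struct request_queue *q,
                                       struct request_queue *lower)
        {
                blk_queue_bounce_pfn(q, lower->limits.bounce_pfn);
        }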
     

23 May, 2009

5 commits

  • To support devices with physical block sizes bigger than 512 bytes we
    need to ensure proper alignment. This patch adds support for exposing
    I/O topology characteristics as devices are stacked.

    logical_block_size is the smallest unit the device can address.

    physical_block_size indicates the smallest I/O the device can write
    without incurring a read-modify-write penalty.

    The io_min parameter is the smallest preferred I/O size reported by
    the device. In many cases this is the same as the physical block
    size. However, the io_min parameter can be scaled up when stacking
    (RAID5 chunk size > physical block size).

    The io_opt characteristic indicates the optimal I/O size reported by
    the device. This is usually the stripe width for arrays.

    The alignment_offset parameter indicates the number of bytes the start
    of the device/partition is offset from the device's natural alignment.
    Partition tools and MD/DM utilities can use this to pad their offsets
    so filesystems start on proper boundaries.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • To accommodate stacking drivers that do not have an associated request
    queue we're moving the limits to a separate, embedded structure.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Convert all external users of queue limits to using wrapper functions
    instead of poking the request queue variables directly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Until now we have had a 1:1 mapping between the storage device's
    physical block size and the logical block size used when addressing
    the device. With SATA 4KB drives coming out, that will no longer be
    the case. The physical sector size will be 4KB but the logical block
    size will remain 512 bytes. Hence we need to distinguish between the
    physical block size and its logical counterpart.

    This patch renames hardsect_size to logical_block_size.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Conflicts:
    drivers/block/hd.c
    drivers/block/mg_disk.c

    Signed-off-by: Jens Axboe

    Jens Axboe
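
    Tying the topology commits above together, a hedged sketch of a driver
    registering these characteristics through the new wrappers; the values
    describe a hypothetical 4KB-physical-sector device sitting on a RAID5
    array with a 64KB chunk and a 256KB stripe width.

        #include <linux/blkdev.h>

        static void mydrv_set_topology(struct request_queue *q)
        {
                blk_queue_logical_block_size(q, 512);    /* addressing unit   */
                blk_queue_physical_block_size(q, 4096);  /* smallest w/o RMW  */
                blk_queue_alignment_offset(q, 0);        /* starts aligned    */
                blk_queue_io_min(q, 65536);              /* preferred minimum */
                blk_queue_io_opt(q, 262144);             /* stripe width      */
        }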
     

20 May, 2009

1 commit


19 May, 2009

2 commits

  • OSD was the last in-tree user of blk_rq_append_bio(). Now that it is
    fixed, blk_rq_append_bio() is un-exported and is only used internally
    by the block layer.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Boaz Harrosh
     
  • New block API:
    given a struct bio, allocate a new request. This is the parallel of
    generic_make_request() for users of BLOCK_PC commands.

    The passed bio may be a chained bio. The bio is bounced if needed
    inside the call to this member.

    This is in the effort of un-exporting blk_rq_append_bio().

    Signed-off-by: Boaz Harrosh
    CC: Jeff Garzik
    Signed-off-by: Jens Axboe

    Boaz Harrosh
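
    In mainline this API landed as blk_make_request(); the name and the
    ERR_PTR-style return are assumptions here, so treat the sketch as
    illustrative only.

        #include <linux/err.h>
        #include <linux/bio.h>
        #include <linux/blkdev.h>

        /* Sketch: turn a (possibly chained) bio into a BLOCK_PC request.
         * Bouncing of the bio happens inside the call, per the text. */
        static struct request *mydrv_bio_to_pc_req(struct request_queue *q,
                                                   struct bio *bio)
        {
                struct request *rq = blk_make_request(q, bio, GFP_KERNEL);

                if (IS_ERR(rq))
                        return rq;

                rq->cmd_type = REQ_TYPE_BLOCK_PC;
                return rq;
        }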
     

11 May, 2009

7 commits

  • Let's put the completion-related functions back in block/blk-core.c
    where they have lived. We can also unexport blk_end_bidi_request() and
    __blk_end_bidi_request(), which nobody uses.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • blk_end_request_all() and __blk_end_request_all() should finish all
    bytes including bidi, by definition. That's what all bidi users need;
    bidi requests must be completed as a whole (partial completion is
    impossible).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • Till now the block layer allowed two separate modes of request execution.
    A request is always acquired from the request queue via
    elv_next_request(). After that, drivers are free to either dequeue it
    or process it without dequeueing. Dequeue allows elv_next_request()
    to return the next request so that multiple requests can be in flight.

    Executing requests without dequeueing has its merits, mostly in
    allowing drivers for simpler devices which can't do sg to deal only
    with segments without considering request boundaries. However, the
    benefit this brings is dubious and declining while the cost of the API
    ambiguity is increasing. Segment-based drivers are usually for very
    old or limited devices, and as converting them to the dequeueing model
    isn't difficult, it doesn't justify the API overhead it puts on the
    block layer and its more modern users.

    Previous patches converted all block low level drivers to dequeueing
    model. This patch completes the API transition by...

    * renaming elv_next_request() to blk_peek_request()

    * renaming blkdev_dequeue_request() to blk_start_request()

    * adding blk_fetch_request() which is combination of peek and start

    * disallowing completion of queued (not started) requests

    * applying new API to all LLDs

    Renamings are for consistency and to break out of tree code so that
    it's apparent that out of tree drivers need updating.

    [ Impact: block request issue API cleanup, no functional change ]

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Cc: James Bottomley
    Cc: Mike Miller
    Cc: unsik Kim
    Cc: Paul Clements
    Cc: Tim Waugh
    Cc: Geert Uytterhoeven
    Cc: David S. Miller
    Cc: Laurent Vivier
    Cc: Jeff Garzik
    Cc: Jeremy Fitzhardinge
    Cc: Grant Likely
    Cc: Adrian McMenamin
    Cc: Stephen Rothwell
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Alex Dubov
    Cc: Pierre Ossman
    Cc: David Woodhouse
    Cc: Markus Lidel
    Cc: Stefan Weinhuber
    Cc: Martin Schwidefsky
    Cc: Pete Zaitcev
    Cc: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Block low level drivers have for some reason been pretty good at
    abusing the block layer API. Especially struct request's fields tend to
    get violated in all possible ways. Make it clear that low level
    drivers MUST NOT access or manipulate rq->sector and rq->data_len
    directly by prefixing them with double underscores.

    This change is also necessary to break the build of out-of-tree code
    which assumes the previous block API, where internal fields can be
    manipulated and rq->data_len carries the residual count on completion.

    [ Impact: hide internal fields, block API change ]

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • struct request has had a few different ways to represent some
    properties of a request. ->hard_* represent the block layer's view of
    the request progress (completion cursor) and the ones without the
    prefix are supposed to represent the issue cursor and are allowed to be
    updated as necessary by the low level drivers. The thing is that as
    the block layer supports partial completion, the two cursors really
    aren't necessary and only cause confusion. In addition, manual
    management of request details from low level drivers is cumbersome and
    error-prone at the very least.

    Another interesting set of duplicate fields is rq->[hard_]nr_sectors
    and rq->{hard_cur|current}_nr_sectors versus rq->data_len and
    rq->bio->bi_size. This is more convoluted than the hard_ case.

    rq->[hard_]nr_sectors are initialized for requests with a bio but
    blk_rq_bytes() uses them only for !pc requests. rq->data_len is
    initialized for all requests but blk_rq_bytes() uses it only for pc
    requests. This causes a good amount of confusion throughout the block
    layer and its drivers, and determining the request length has been a
    bit of black magic which may or may not work depending on circumstances
    and what the specific LLD is actually doing.

    rq->{hard_cur|current}_nr_sectors represent the number of sectors in
    the contiguous data area at the front. This is mainly used by drivers
    which transfer data by walking the request segment-by-segment. This
    value always equals rq->bio->bi_size >> 9. However, the data length
    for pc requests may not be a multiple of 512 bytes and using this field
    becomes a bit confusing.

    In general, having multiple fields to represent the same property
    leads only to confusion and subtle bugs. With recent block low level
    driver cleanups, no driver is accessing or manipulating these
    duplicate fields directly. Drop all the duplicates. Now rq->sector
    means the current sector, rq->data_len the current total length and
    rq->bio->bi_size the current segment length. Everything else is
    defined in terms of these three and available only through accessors.

    * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
    now handles pc and fs requests equally other than rq->sector update.
    This means that now pc requests can use partial completion too (no
    in-kernel user yet, though).

    * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
    now uses byte count as the primary data length.

    * blk_rq_pos() is now guaranteed to be always correct. In-block users
    converted.

    * blk_rq_bytes() is now guaranteed to be always valid as is
    blk_rq_sectors(). In-block users converted.

    * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
    Whichever is more convenient is used.

    * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
    pointer to request.

    [ Impact: API cleanup, single way to represent one property of a request ]

    Signed-off-by: Tejun Heo
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Implement accessors - blk_rq_pos(), blk_rq_sectors() and
    blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
    and rq->hard_cur_sectors respectively and convert direct references of
    the said fields to the accessors.

    This is in preparation of request data length handling cleanup.

    Geert : suggested adding const to struct request * parameter to accessors
    Sergei : spotted error in patch description

    [ Impact: cleanup ]

    Signed-off-by: Tejun Heo
    Acked-by: Geert Uytterhoeven
    Acked-by: Stephen Rothwell
    Tested-by: Grant Likely
    Acked-by: Grant Likely
    Acked-by: Sergei Shtylyov
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • rq->data_len served two purposes - the length of data buffer on issue
    and the residual count on completion. This duality creates some
    headaches.

    First of all, the block layer and low level drivers can't really
    determine what rq->data_len contains while a request is executing. It
    could be the total request length or it could be anything else one of
    the lower layers is using to keep track of the residual count. This
    complicates things because blk_rq_bytes() and thus
    [__]blk_end_request_all() rely on rq->data_len for PC commands.
    Drivers which want to report residual count should first cache the
    total request length, update rq->data_len and then complete the
    request with the cached data length.

    Secondly, it makes requests default to reporting the full residual
    count, ie. reporting that no data transfer occurred. The residual
    count is the exception, not the norm; however, the driver must clear
    rq->data_len to zero to signify the normal case, while leaving it
    alone means that no data transfer occurred at all. This reverse
    default behavior complicates code unnecessarily and renders block PC
    on some drivers (ide-tape/floppy) unusable.

    This patch adds rq->resid_len which is used only for residual count.

    While at it, remove the now unnecessary blk_rq_bytes() caching in
    ide_pc_intr() as rq->data_len is not changed anymore.

    Boaz : spotted missing conversion in osd
    Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape

    [ Impact: cleanup residual count handling, report 0 resid by default ]

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Alan Stern
    Cc: FUJITA Tomonori
    Cc: Doug Gilbert
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Darrick J. Wong
    Cc: Pete Zaitcev
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Tejun Heo
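
    Pulling the 11 May changes together, a hedged sketch of a simple LLD's
    request function under the new API (blk_fetch_request(), the
    blk_rq_pos()/blk_rq_cur_bytes() accessors, and rq->resid_len); the
    transfer routine is a placeholder.

        #include <linux/blkdev.h>

        /* Placeholder for the real data transfer; returns bytes moved. */
        static unsigned int mydrv_xfer(struct request *rq, sector_t pos,
                                       unsigned int bytes)
        {
                return bytes;
        }

        /* Called with the queue lock held, hence __blk_end_request_all(). */
        static void mydrv_request_fn(struct request_queue *q)
        {
                struct request *rq;

                while ((rq = blk_fetch_request(q)) != NULL) { /* peek + start */
                        unsigned int done;

                        done = mydrv_xfer(rq, blk_rq_pos(rq),    /* current sector  */
                                          blk_rq_cur_bytes(rq)); /* current segment */

                        /* Residual count is explicit now and defaults to 0. */
                        rq->resid_len = blk_rq_bytes(rq) - done;

                        __blk_end_request_all(rq, done ? 0 : -EIO);
                }
        }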
     

03 May, 2009

1 commit


01 May, 2009

1 commit

  • The original patch (dfa4411cc3a690011cab90e9a536938795366cf9) was
    buggy. This is a more proper fix which introduces a blk_rq_quiet()
    macro, alleviating the need for dumb, too-short caching variables.

    Thanks to Helge Deller and Bart for debugging this.

    Signed-off-by: Borislav Petkov
    Cc: Jens Axboe
    Cc: Sergei Shtylyov
    Reported-and-tested-by: Helge Deller
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Borislav Petkov
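
    The bug class being fixed is caching rq->cmd_flags in a variable too
    narrow to hold REQ_QUIET; a hedged sketch of the failure mode and the
    macro (its exact body is an assumption consistent with the description):

        #include <linux/kernel.h>
        #include <linux/blkdev.h>

        /* Assumed definition: test the flag on the request directly
         * instead of on a narrow cached copy of cmd_flags. */
        #define blk_rq_quiet(rq)        ((rq)->cmd_flags & REQ_QUIET)

        static void mydrv_report_error(struct request *rq, int err)
        {
                /* Broken pattern: caching cmd_flags in a too-short local
                 * (e.g. a u8) silently drops the REQ_QUIET bit. */
                if (!blk_rq_quiet(rq))
                        printk(KERN_ERR "mydrv: request failed: %d\n", err);
        }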