12 Jun, 2009

3 commits

  • Fix kernel-doc warnings in recently changed block/ source code.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • * 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
    block: add request clone interface (v2)
    floppy: fix hibernation
    ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
    fs/bio.c: add missing __user annotation
    block: prevent possible io_context->refcount overflow
    Add serial number support for virtio_blk, V4a
    block: Add missing bounce_pfn stacking and fix comments
    Revert "block: Fix bounce limit setting in DM"
    cciss: decode unit attention in SCSI error handling code
    cciss: Remove no longer needed sendcmd reject processing code
    cciss: change SCSI error handling routines to work with interrupts enabled.
    cciss: separate error processing and command retrying code in sendcmd_withirq_core()
    cciss: factor out fix target status processing code from sendcmd functions
    cciss: simplify interface of sendcmd() and sendcmd_withirq()
    cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
    cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
    block: needs to set the residual length of a bidi request
    Revert "block: implement blkdev_readpages"
    block: Fix bounce limit setting in DM
    Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
    ...

    Manually fix conflicts with tracing updates in:
    block/blk-sysfs.c
    drivers/ide/ide-atapi.c
    drivers/ide/ide-cd.c
    drivers/ide/ide-floppy.c
    drivers/ide/ide-tape.c
    include/trace/events/block.h
    kernel/trace/blktrace.c

    Linus Torvalds
     
  • * 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (28 commits)
    ide-tape: fix debug call
    alim15x3: Remove historical hacks, re-enable init_hwif for PowerPC
    ide-dma: don't reset request fields on dma_timeout_retry()
    ide: drop rq->data handling from ide_map_sg()
    ide-atapi: kill unused fields and callbacks
    ide-tape: simplify read/write functions
    ide-tape: use byte size instead of sectors on rw issue functions
    ide-tape: unify r/w init paths
    ide-tape: kill idetape_bh
    ide-tape: use standard data transfer mechanism
    ide-tape: use single continuous buffer
    ide-atapi,tape,floppy: allow ->pc_callback() to change rq->data_len
    ide-tape,floppy: fix failed command completion after request sense
    ide-pm: don't abuse rq->data
    ide-cd,atapi: use bio for internal commands
    ide-atapi: convert ide-{floppy,tape} to using preallocated sense buffer
    ide-cd: convert to using generic sense request
    ide: add helpers for preparing sense requests
    ide-cd: don't abuse rq->buffer
    ide-atapi: don't abuse rq->buffer
    ...

    Linus Torvalds
     

11 Jun, 2009

3 commits

  • This patch adds the following 2 interfaces for request-stacking drivers:

    - blk_rq_prep_clone(struct request *clone, struct request *orig,
    struct bio_set *bs, gfp_t gfp_mask,
    int (*bio_ctr)(struct bio *, struct bio*, void *),
    void *data)
    * Clones bios in the original request to the clone request
    (bio_ctr is called for each cloned bio.)
    * Copies attributes of the original request to the clone request.
    The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not
    copied.

    - blk_rq_unprep_clone(struct request *clone)
    * Frees cloned bios from the clone request.

    Request stacking drivers (e.g. request-based dm) need to make a clone
    request for a submitted request and dispatch it to other devices.

    To allocate a request for the clone, request stacking drivers may not
    be able to use blk_get_request() because the allocation may be done
    in an irq-disabled context.
    So blk_rq_prep_clone() takes a request allocated by the caller
    as an argument.

    For each clone bio in the clone request, request stacking drivers
    should be able to set up their own completion handler.
    So blk_rq_prep_clone() takes a callback function which is called
    for each clone bio, and a pointer for private data which is passed
    to the callback.

    NOTE:
    blk_rq_prep_clone() doesn't copy any actual data of the original
    request. Pages are shared between original bios and cloned bios.
    So the caller must not complete the original request before the clone
    request.
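
    A minimal sketch of how a request-based stacking driver might use this
    pair (the clone-info structure, the callback names and the GFP flag are
    illustrative assumptions, not taken from any particular driver):

    /* Hypothetical per-clone-bio setup callback, invoked via bio_ctr. */
    static int my_bio_ctr(struct bio *clone_bio, struct bio *orig_bio, void *data)
    {
            struct my_clone_info *info = data;       /* driver-private, hypothetical */

            clone_bio->bi_end_io  = my_clone_end_io; /* per-bio completion handler */
            clone_bio->bi_private = info;
            return 0;
    }

    static int my_setup_clone(struct request *clone, struct request *orig,
                              struct my_clone_info *info)
    {
            /* Pages are shared with the original bios; only bio structs are cloned. */
            if (blk_rq_prep_clone(clone, orig, info->bs, GFP_ATOMIC,
                                  my_bio_ctr, info))
                    return -ENOMEM;

            /* ... dispatch 'clone' to the underlying device ... */
            return 0;
    }

    /* On an error before dispatch, the cloned bios are released again with
     * blk_rq_unprep_clone(clone). */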

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
     
  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
    Revert "x86, bts: reenable ptrace branch trace support"
    tracing: do not translate event helper macros in print format
    ftrace/documentation: fix typo in function grapher name
    tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
    tracing: add protection around module events unload
    tracing: add trace_seq_vprint interface
    tracing: fix the block trace points print size
    tracing/events: convert block trace points to TRACE_EVENT()
    ring-buffer: fix ret in rb_add_time_stamp
    ring-buffer: pass in lockdep class key for reader_lock
    tracing: add annotation to what type of stack trace is recorded
    tracing: fix multiple use of __print_flags and __print_symbolic
    tracing/events: fix output format of user stack
    tracing/events: fix output format of kernel stack
    tracing/trace_stack: fix the number of entries in the header
    ring-buffer: discard timestamps that are at the start of the buffer
    ring-buffer: try to discard unneeded timestamps
    ring-buffer: fix bug in ring_buffer_discard_commit
    ftrace: do not profile functions when disabled
    tracing: make trace pipe recognize latency format flag
    ...

    Linus Torvalds
     
  • Currently io_context has an atomic_t (32-bit) as its refcount. In the case of
    cfq, a reference to the io_context is taken for each device against which a
    task does I/O. In addition, multiple processes sharing an io_context
    (CLONE_IO) each hold a reference to the same io_context.

    Theoretically the possible maximum number of processes sharing the same
    io_context + the number of disks/cfq_data referring to the same io_context
    can overflow the 32-bit counter on a very high-end machine.

    Even though it is an improbable case, let us make it atomic_long_t.
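
    In concrete terms the change amounts to widening the counter and using the
    long atomic helpers; a sketch of the idea (the field name and the free path
    follow the usual io_context conventions and are assumptions, not quotes from
    the patch):

    struct io_context {
            atomic_long_t refcount;        /* was: atomic_t refcount */
            /* ... remaining fields unchanged ... */
    };

    /* taking a reference */
    atomic_long_inc(&ioc->refcount);

    /* dropping a reference; free on the last put */
    if (atomic_long_dec_and_test(&ioc->refcount))
            kmem_cache_free(iocontext_cachep, ioc);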

    Signed-off-by: Nikanth Karthikesan
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Nikanth Karthikesan
     

10 Jun, 2009

1 commit

  • TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
    these new capabilities to this tracepoint:

    - zero-copy and per-cpu splice() tracing
    - binary tracing without printf overhead
    - structured logging records exposed under /debug/tracing/events
    - trace events embedded in function tracer output and other plugins
    - user-defined, per tracepoint filter expressions
    ...

    Cons:

    - no dev_t info for the output of plug, unplug_timer and unplug_io events.
    no dev_t info for getrq and sleeprq events if bio == NULL.
    no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.

    This is mainly because we can't get the device from a request queue.
    But this may change in the future.

    - A packet command is converted to a string in TP_assign, not TP_print,
    while blktrace does the conversion just before output.

    Since pc requests should be rather rare, this is not a big issue.

    - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
    has a unique format, which means we have some unused data in a trace entry.

    The overhead is minimized by using __dynamic_array() instead of __array().
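
    For reference, a block tracepoint defined with TRACE_EVENT has roughly this
    shape (a simplified sketch modeled on the plug event, which is compared
    against blktrace output further below; the exact definitions live in
    include/trace/events/block.h and may differ in detail):

    TRACE_EVENT(block_plug,
            TP_PROTO(struct request_queue *q),
            TP_ARGS(q),

            TP_STRUCT__entry(
                    __array(char, comm, TASK_COMM_LEN)
            ),

            TP_fast_assign(
                    memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
            ),

            TP_printk("[%s]", __entry->comm)
    );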

    I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:

         dd                  dd + ioctl blktrace   dd + TRACE_EVENT (splice)
    1    7.36s, 42.7 MB/s    7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
    2    7.43s, 42.3 MB/s    7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
    3    7.38s, 42.6 MB/s    7.45s, 42.2 MB/s      7.41s, 42.5 MB/s

    So the overhead of tracing is very small, and there is no regression when
    using these trace events versus blktrace.

    And the binary output of TRACE_EVENT is much smaller than blktrace:

    # ls -l -h
    -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
    -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
    -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out

    Following are some comparisons between TRACE_EVENT and blktrace:

    plug:
    kjournald-480 [000] 303.084981: block_plug: [kjournald]
    kjournald-480 [000] 303.084981: 8,0 P N [kjournald]

    unplug_io:
    kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
    kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1

    remap:
    kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8

    Changelog from v2 -> v3:

    - use the newly introduced __dynamic_array().

    Changelog from v1 -> v2:

    - use __string() instead of __array() to minimize the memory required
    to store hex dump of rq->cmd().

    - support large pc requests.

    - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.

    - some cleanups.

    Signed-off-by: Li Zefan
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Li Zefan
     

09 Jun, 2009

4 commits


03 Jun, 2009

1 commit

  • blk_queue_bounce_limit() is more than a wrapper about the request queue
    limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can
    be called by stacking drivers that wish to set the bounce limit
    explicitly.
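
    A sketch of the intended use in a stacking driver's queue setup (the helper
    that derives the limit from the underlying devices is hypothetical):

    /* Apply the strictest bounce limit found among the underlying devices
     * to the stacked queue explicitly. */
    u64 bounce_pfn = my_lowest_underlying_bounce_pfn(td);   /* hypothetical */

    blk_queue_bounce_pfn(q, bounce_pfn);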

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

02 Jun, 2009

1 commit

  • I found one more mis-conversion to the 'request is always dequeued
    when completing' model in elv_abort_queue() during code inspection.
    Although I haven't hit any problem caused by this mis-conversion yet
    and have only done a compile/boot test, please apply if you have no objection.

    Request must be dequeued when it completes.
    However, elv_abort_queue() completes requests without dequeueing.
    This will cause oops in the __blk_end_request_all().
    This patch fixes the oops.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
     

30 May, 2009

1 commit

  • Doing a bit of torture testing, I ran across a BUG in the block
    subsystem (at blk-core.c:2048): the test for whether the request is queued.

    It turns out the trigger was a BLKPREP_KILL coming out of the SCSI prep
    function. Currently for BLKPREP_KILL requests, we send them straight
    into __blk_end_request_all() with an error, but they've never been
    dequeued, so they trip the bug. Fix this by starting requests before
    killing them.
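
    A sketch of the resulting pattern in the BLKPREP_KILL path (simplified and
    assumed from the description above, not quoted from the patch; surrounding
    error handling omitted):

    if (ret == BLKPREP_KILL) {
            rq->cmd_flags |= REQ_QUIET;
            /*
             * Start (dequeue) the request first so the completion path
             * does not trip over a still-queued request.
             */
            blk_start_request(rq);
            __blk_end_request_all(rq, -EIO);
    }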

    Signed-off-by: James Bottomley
    Signed-off-by: Jens Axboe

    James Bottomley
     

28 May, 2009

1 commit


27 May, 2009

2 commits

  • The commit below in 2.6-block/for-2.6.31 causes a no-diskstat problem
    (all-zero statistics in /proc/diskstats) because the blk_discard_rq()
    check was added with '&&'.
    It should be 'blk_fs_request() || blk_discard_rq()'.
    This patch does that and fixes the no-diskstat problem.
    Please review and apply.

    ------ /proc/diskstat without this patch -------------------------------------
    8 0 sda 0 0 0 0 0 0 0 0 0 0 0
    ------------------------------------------------------------------------------

    ----- /proc/diskstat with this patch applied ---------------------------------
    8 0 sda 4186 303 373621 61600 9578 3859 107468 169479 2 89755 231059
    ------------------------------------------------------------------------------

    --------------------------------------------------------------------------
    commit c69d48540c201394d08cb4d48b905e001313d9b8
    Author: Jens Axboe
    Date: Fri Apr 24 08:12:19 2009 +0200

    block: include discard requests in IO accounting

    We currently don't do merging on discard requests, but we potentially
    could. If we do, then we need to include discard requests in the IO
    accounting, or merging would end up decrementing in_flight IO counters
    for an IO which never incremented them.

    So enable accounting for discard requests.

    static inline int blk_do_io_stat(struct request *rq)
    {
    - return rq->rq_disk && blk_rq_io_stat(rq) && blk_fs_request(rq);
    + return rq->rq_disk && blk_rq_io_stat(rq) && blk_fs_request(rq) &&
    + blk_discard_rq(rq);
    }
    --------------------------------------------------------------------------
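
    With the fix described above, the predicate becomes (discard requests are
    accounted in addition to, not instead of, fs requests):

    static inline int blk_do_io_stat(struct request *rq)
    {
            return rq->rq_disk && blk_rq_io_stat(rq) &&
                    (blk_fs_request(rq) || blk_discard_rq(rq));
    }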

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
     
  • commit e8939a50466fd963eb1ba9118c34b9ffb7ff6aa6
    Author: Tejun Heo
    Date: Fri May 8 11:54:16 2009 +0900

    block: implement and enforce request peek/start/fetch

    Added a BUG_ON(blk_queued_rq(req)) to the top of blk_finish_req().
    Unfortunately, this checks whether req->queuelist is empty. This list
    is doing double duty both as the queue list and the tag list, so tagged
    requests come in here with this not empty and boom (the tag list is
    emptied by blk_queue_end_tag() lower down).

    Fix this by moving the BUG_ON to below the end tag handling. We also
    seem vulnerable to this in blk_requeue_request() as well. I think all uses
    of blk_queued_rq() need auditing because the check is clearly wrong in
    the tagged case.

    Signed-off-by: James Bottomley
    Signed-off-by: Jens Axboe

    James Bottomley
     

23 May, 2009

6 commits

  • To support devices with physical block sizes bigger than 512 bytes we
    need to ensure proper alignment. This patch adds support for exposing
    I/O topology characteristics as devices are stacked.

    logical_block_size is the smallest unit the device can address.

    physical_block_size indicates the smallest I/O the device can write
    without incurring a read-modify-write penalty.

    The io_min parameter is the smallest preferred I/O size reported by
    the device. In many cases this is the same as the physical block
    size. However, the io_min parameter can be scaled up when stacking
    (RAID5 chunk size > physical block size).

    The io_opt characteristic indicates the optimal I/O size reported by
    the device. This is usually the stripe width for arrays.

    The alignment_offset parameter indicates the number of bytes the start
    of the device/partition is offset from the device's natural alignment.
    Partition tools and MD/DM utilities can use this to pad their offsets
    so filesystems start on proper boundaries.
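
    A sketch of how a low-level driver might advertise these characteristics
    when setting up its queue (the values are illustrative only: a disk with
    4KB physical sectors, 512-byte logical blocks and a start offset of 3584
    bytes from its natural alignment):

    blk_queue_logical_block_size(q, 512);
    blk_queue_physical_block_size(q, 4096);
    blk_queue_io_min(q, 4096);              /* smallest preferred I/O size */
    blk_queue_io_opt(q, 0);                 /* no reported optimal size    */
    blk_queue_alignment_offset(q, 3584);    /* bytes of misalignment       */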

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Currently stacking devices do not have a queue directory in sysfs.
    However, many of the I/O characteristics like sector size, maximum
    request size, etc. are queue properties.

    This patch enables the queue directory for MD/DM devices. The elevator
    code has been modified to deal with queues that do not have an I/O
    scheduler.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • To accommodate stacking drivers that do not have an associated request
    queue we're moving the limits to a separate, embedded structure.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Convert all external users of queue limits to using wrapper functions
    instead of poking the request queue variables directly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Until now we have had a 1:1 mapping between storage device physical
    block size and the logical block size used when addressing the device.
    With SATA 4KB drives coming out that will no longer be the case. The
    sector size will be 4KB but the logical block size will remain
    512-bytes. Hence we need to distinguish between the physical block size
    and the logical ditto.

    This patch renames hardsect_size to logical_block_size.
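
    For driver code the rename amounts to the following (sketch; 512 is just an
    example value):

    /* before */
    blk_queue_hardsect_size(q, 512);

    /* after */
    blk_queue_logical_block_size(q, 512);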

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Conflicts:
    drivers/block/hd.c
    drivers/block/mg_disk.c

    Signed-off-by: Jens Axboe

    Jens Axboe
     

20 May, 2009

2 commits


19 May, 2009

4 commits

  • OSD was the last in-tree user of blk_rq_append_bio(). Now
    that it is fixed, blk_rq_append_bio() is un-exported and
    is only used internally by the block layer.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Boaz Harrosh
     
  • New block API:
    given a struct bio, allocates a new request. This is the parallel of
    generic_make_request() for BLOCK_PC command users.

    The passed bio may be a chained-bio. The bio is bounced if needed
    inside the call to this member.

    This is in the effort of un-exporting blk_rq_append_bio().
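
    A sketch of the intended use by a BLOCK_PC submitter (assuming the obvious
    signature implied above; command setup abbreviated and error handling kept
    minimal):

    struct request *rq;

    rq = blk_make_request(q, bio, GFP_KERNEL);   /* bio may be a bio chain */
    if (IS_ERR(rq))
            return PTR_ERR(rq);

    rq->cmd_type = REQ_TYPE_BLOCK_PC;
    /* fill in rq->cmd[], rq->cmd_len, timeout, sense buffer, ... */
    blk_execute_rq(q, NULL, rq, 0);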

    Signed-off-by: Boaz Harrosh
    CC: Jeff Garzik
    Signed-off-by: Jens Axboe

    Boaz Harrosh
     
  • Use blk_rq_append_bio() internally instead of blk_rq_bio_prep()
    so blk_rq_map_kern can be called multiple times, to map multiple
    buffers.

    This is in the effort to un-export blk_rq_append_bio()
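
    A sketch of the multi-buffer mapping this enables (buffer names are
    illustrative; each call appends another bio to the same request):

    ret = blk_rq_map_kern(q, rq, hdr_buf, hdr_len, GFP_KERNEL);
    if (!ret)
            ret = blk_rq_map_kern(q, rq, data_buf, data_len, GFP_KERNEL);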

    Signed-off-by: James Bottomley
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Jens Axboe

    James Bottomley
     
  • In commit c3a4d78c580de4edc9ef0f7c59812fb02ceb037f, while introducing
    rq->resid_len, the default value of residue count was changed from
    full count to zero. The conversion was done under the assumption that
    when a request fails residue count wasn't defined. However, Boaz and
    James pointed out that this wasn't true and the residue count should
    be preserved for failed requests too.

    This patchset restores the original behavior by setting rq->resid_len
    to blk_rq_bytes(rq) on request start and restoring explicit clearing
    in affected drivers. While at it, take advantage of the fact that
    rq->resid_len is set to full count where applicable.

    * ide-cd: rq->resid_len cleared on pc success

    * mptsas: req->resid_len cleared on success

    * sas_expander: rsp/req->resid_len cleared on success

    * mpt2sas_transport: req->resid_len cleared on success

    * ide-cd, ide-tape, mptsas, sas_host_smp, mpt2sas_transport, ub: take
    advantage of initial full count to simplify code

    Boaz Harrosh spotted a bug in resid_len initialization. Fixed as
    suggested.

    Signed-off-by: Tejun Heo
    Acked-by: Borislav Petkov
    Cc: Boaz Harrosh
    Cc: James Bottomley
    Cc: Pete Zaitcev
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Sergei Shtylyov
    Cc: Eric Moore
    Cc: Darrick J. Wong
    Signed-off-by: Jens Axboe

    Tejun Heo
     

18 May, 2009

1 commit


12 May, 2009

1 commit

  • Current bio_vec array index out-of-bounds test within
    __end_that_request_first() does not seem correct.
    It checks bio->bi_idx against bio->bi_vcnt, but the subsequent code
    uses idx (which is, bio->bi_idx + next_idx) as the array index into
    bio_vec array. This means that the test really makes sense only at
    the first iteration of the !(nr_bytes >= bio->bi_size) case (when next_idx
    == zero). Fix this by replacing bio->bi_idx with idx.
    (This patch applies to 2.6.30-rc4.)
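
    In code terms the described change is roughly the following (a sketch of
    the before/after check only; the surrounding error reporting is omitted):

    /* before: tests the bio's own index, which is stale once next_idx != 0 */
    if (unlikely(bio->bi_idx >= bio->bi_vcnt)) {
            /* report the bad bio and bail out */
    }

    /* after: test the index actually used to address the bio_vec array */
    idx = bio->bi_idx + next_idx;
    if (unlikely(idx >= bio->bi_vcnt)) {
            /* report the bad bio and bail out */
    }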

    Signed-off-by: Kazuhisa Ichikawa
    Signed-off-by: Jens Axboe

    Kazuhisa Ichikawa
     

11 May, 2009

7 commits

  • Let's put the completion related functions back to block/blk-core.c
    where they have lived. We can also unexport blk_end_bidi_request() and
    __blk_end_bidi_request(), which nobody uses.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • Till now block layer allowed two separate modes of request execution.
    A request is always acquired from the request queue via
    elv_next_request(). After that, drivers are free to either dequeue it
    or process it without dequeueing. Dequeue allows elv_next_request()
    to return the next request so that multiple requests can be in flight.

    Executing requests without dequeueing has its merits mostly in
    allowing drivers for simpler devices which can't do sg to deal with
    segments only without considering request boundary. However, the
    benefit this brings is dubious and declining while the cost of the API
    ambiguity is increasing. Segment based drivers are usually for very
    old or limited devices and as converting to dequeueing model isn't
    difficult, it doesn't justify the API overhead it puts on block layer
    and its more modern users.

    Previous patches converted all block low level drivers to dequeueing
    model. This patch completes the API transition by...

    * renaming elv_next_request() to blk_peek_request()

    * renaming blkdev_dequeue_request() to blk_start_request()

    * adding blk_fetch_request() which is combination of peek and start

    * disallowing completion of queued (not started) requests

    * applying new API to all LLDs

    Renamings are for consistency and to break out-of-tree code so that
    it's apparent that out-of-tree drivers need updating.

    [ Impact: block request issue API cleanup, no functional change ]
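
    A sketch of the issue path under the new API for a typical
    single-request-at-a-time driver (the processing helper is hypothetical):

    /* Dequeueing model: fetch (= peek + start) each request, process it,
     * then complete it. Only started requests may be completed. */
    static void my_request_fn(struct request_queue *q)
    {
            struct request *rq;

            while ((rq = blk_fetch_request(q)) != NULL) {
                    int error = my_process_request(rq);   /* hypothetical */

                    __blk_end_request_all(rq, error);
            }
    }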

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Cc: James Bottomley
    Cc: Mike Miller
    Cc: unsik Kim
    Cc: Paul Clements
    Cc: Tim Waugh
    Cc: Geert Uytterhoeven
    Cc: David S. Miller
    Cc: Laurent Vivier
    Cc: Jeff Garzik
    Cc: Jeremy Fitzhardinge
    Cc: Grant Likely
    Cc: Adrian McMenamin
    Cc: Stephen Rothwell
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Alex Dubov
    Cc: Pierre Ossman
    Cc: David Woodhouse
    Cc: Markus Lidel
    Cc: Stefan Weinhuber
    Cc: Martin Schwidefsky
    Cc: Pete Zaitcev
    Cc: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Block low level drivers for some reason have been pretty good at
    abusing block layer API. Especially struct request's fields tend to
    get violated in all possible ways. Make it clear that low level
    drivers MUST NOT access or manipulate rq->sector and rq->data_len
    directly by prefixing them with double underscores.

    This change is also necessary to break the build of out-of-tree code
    which assume the previous block API where internal fields can be
    manipulated and rq->data_len carries residual count on completion.

    [ Impact: hide internal fields, block API change ]

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • struct request has had a few different ways to represent some
    properties of a request. ->hard_* represent block layer's view of the
    request progress (completion cursor) and the ones without the prefix
    are supposed to represent the issue cursor and allowed to be updated
    as necessary by the low level drivers. The thing is that as block
    layer supports partial completion, the two cursors really aren't
    necessary and only cause confusion. In addition, manual management of
    request detail from low level drivers is cumbersome and error-prone at
    the very least.

    Another interesting set of duplicate fields is rq->[hard_]nr_sectors and
    rq->{hard_cur|current}_nr_sectors versus rq->data_len and
    rq->bio->bi_size. This is more convoluted than the hard_ case.

    rq->[hard_]nr_sectors are initialized for requests with bio but
    blk_rq_bytes() uses it only for !pc requests. rq->data_len is
    initialized for all requests but blk_rq_bytes() uses it only for pc
    requests. This causes a good amount of confusion throughout the block layer
    and its drivers and determining the request length has been a bit of
    black magic which may or may not work depending on circumstances and
    what the specific LLD is actually doing.

    rq->{hard_cur|current}_nr_sectors represent the number of sectors in
    the contiguous data area at the front. This is mainly used by drivers
    which transfer data by walking the request segment-by-segment. This
    value always equals rq->bio->bi_size >> 9. However, data length for
    pc requests may not be a multiple of 512 bytes and using this field
    becomes a bit confusing.

    In general, having multiple fields to represent the same property
    leads only to confusion and subtle bugs. With recent block low level
    driver cleanups, no driver is accessing or manipulating these
    duplicate fields directly. Drop all the duplicates. Now rq->sector
    means the current sector, rq->data_len the current total length and
    rq->bio->bi_size the current segment length. Everything else is
    defined in terms of these three and available only through accessors.

    * blk_recalc_rq_sectors() is collapsed into blk_update_request() and
    now handles pc and fs requests equally other than rq->sector update.
    This means that now pc requests can use partial completion too (no
    in-kernel user yet tho).

    * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer
    now uses byte count as the primary data length.

    * blk_rq_pos() is now guaranteed to be always correct. In-block users
    converted.

    * blk_rq_bytes() is now guaranteed to be always valid as is
    blk_rq_sectors(). In-block users converted.

    * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9.
    More convenient one is used.

    * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const
    pointer to request.

    [ Impact: API cleanup, single way to represent one property of a request ]
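
    From a driver's point of view the three remaining quantities and their
    derived forms look like this (a sketch; rq is any in-flight request):

    sector_t     pos   = blk_rq_pos(rq);        /* current sector           */
    unsigned int bytes = blk_rq_bytes(rq);      /* current total length     */
    unsigned int secs  = blk_rq_sectors(rq);    /* == blk_rq_bytes(rq) >> 9 */
    unsigned int seg   = blk_rq_cur_bytes(rq);  /* current segment length   */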

    Signed-off-by: Tejun Heo
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • With recent cleanups, there is no place where low level driver
    directly manipulates request fields. This means that the 'hard'
    request fields always equal the !hard fields. Convert all
    rq->sectors, nr_sectors and current_nr_sectors references to
    accessors.

    While at it, drop the superfluous blk_rq_pos() < 0 test in swim.c.

    [ Impact: use pos and nr_sectors accessors ]

    Signed-off-by: Tejun Heo
    Acked-by: Geert Uytterhoeven
    Tested-by: Grant Likely
    Acked-by: Grant Likely
    Tested-by: Adrian McMenamin
    Acked-by: Adrian McMenamin
    Acked-by: Mike Miller
    Cc: James Bottomley
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Eric Moore
    Cc: Alan Stern
    Cc: FUJITA Tomonori
    Cc: Pete Zaitcev
    Cc: Stephen Rothwell
    Cc: Paul Clements
    Cc: Tim Waugh
    Cc: Jeff Garzik
    Cc: Jeremy Fitzhardinge
    Cc: Alex Dubov
    Cc: David Woodhouse
    Cc: Martin Schwidefsky
    Cc: Dario Ballabio
    Cc: David S. Miller
    Cc: Rusty Russell
    Cc: unsik Kim
    Cc: Laurent Vivier
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Implement accessors - blk_rq_pos(), blk_rq_sectors() and
    blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors
    and rq->hard_cur_sectors respectively and convert direct references of
    the said fields to the accessors.

    This is in preparation of request data length handling cleanup.

    Geert : suggested adding const to struct request * parameter to accessors
    Sergei : spotted error in patch description

    [ Impact: cleanup ]
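
    As described, the accessors are simple inline wrappers around the hard_*
    fields; roughly (a sketch matching the description above):

    static inline sector_t blk_rq_pos(const struct request *rq)
    {
            return rq->hard_sector;
    }

    static inline unsigned int blk_rq_sectors(const struct request *rq)
    {
            return rq->hard_nr_sectors;
    }

    static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
    {
            return rq->hard_cur_sectors;
    }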

    Signed-off-by: Tejun Heo
    Acked-by: Geert Uytterhoeven
    Acked-by: Stephen Rothwell
    Tested-by: Grant Likely
    Acked-by: Grant Likely
    Acked-by: Sergei Shtylyov
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • rq->data_len served two purposes - the length of data buffer on issue
    and the residual count on completion. This duality creates some
    headaches.

    First of all, block layer and low level drivers can't really determine
    what rq->data_len contains while a request is executing. It could be
    the total request length or it could be anything else one of the
    lower layers is using to keep track of residual count. This
    complicates things because blk_rq_bytes() and thus
    [__]blk_end_request_all() relies on rq->data_len for PC commands.
    Drivers which want to report residual count should first cache the
    total request length, update rq->data_len and then complete the
    request with the cached data length.

    Secondly, it makes requests default to reporting full residual count,
    ie. reporting that no data transfer occurred. The residual count is
    an exception not the norm; however, the driver should clear
    rq->data_len to zero to signify the normal cases while leaving it
    alone means no data transfer occurred at all. This reverse default
    behavior complicates code unnecessarily and renders block PC on some
    drivers (ide-tape/floppy) unusable.

    This patch adds rq->resid_len which is used only for residual count.

    While at it, remove the now-unnecessary blk_rq_bytes() caching in
    ide_pc_intr() as rq->data_len is not changed anymore.

    Boaz : spotted missing conversion in osd
    Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape

    [ Impact: cleanup residual count handling, report 0 resid by default ]
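
    A sketch of the new convention from a driver's point of view (the
    bytes_done variable is hypothetical):

    /* Default: rq->resid_len is 0, i.e. full transfer; nothing to do.   */
    /* A driver that performed a short transfer reports the residual:    */
    rq->resid_len = blk_rq_bytes(rq) - bytes_done;
    __blk_end_request_all(rq, 0);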

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Borislav Petkov
    Cc: Sergei Shtylyov
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Alan Stern
    Cc: FUJITA Tomonori
    Cc: Doug Gilbert
    Cc: Mike Miller
    Cc: Eric Moore
    Cc: Darrick J. Wong
    Cc: Pete Zaitcev
    Cc: Boaz Harrosh
    Signed-off-by: Jens Axboe

    Tejun Heo
     

07 May, 2009

1 commit


06 May, 2009

1 commit