Eric Lee / linux-smarc-t335x-v3.2

23 Aug, 2010

1 commit

5dd531a03 block: add function call to switch the IO scheduler from a driver

Currently drivers must do an elevator_exit() + elevator_init()
to switch IO schedulers. There are a few problems with this:

- Since commit 1abec4fdbb142e3ccb6ce99832fae42129134a96,
elevator_init() requires a zeroed out q->elevator
pointer. The two existing in-kernel users don't do that.

- It will only work at initialization time, since using the
above two-staged construct does not properly quiesce the queue.

So add elevator_change() which takes care of this, and convert
the elv_iosched_store() sysfs interface to use this helper as well.

Reported-by: Peter Oberparleiter
Reported-by: Kevin Vigor
Signed-off-by: Jens Axboe

Jens Axboe
2010-08-23 19:52:19 +0800

12 Aug, 2010

1 commit

8d57a98cc block: add secure discard

Secure discard is the same as discard except that all copies of the
discarded sectors (perhaps created by garbage collection) must also be
erased.

Signed-off-by: Adrian Hunter
Acked-by: Jens Axboe
Cc: Kyungmin Park
Cc: Madhusudhan Chikkature
Cc: Christoph Hellwig
Cc: Ben Gardiner
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Hunter
2010-08-12 23:43:30 +0800

08 Aug, 2010

2 commits

7b6d91dae block: unify flags for struct bio and struct request

Remove the current bio flags and reuse the request flags for the bio, too.
This makes it easier to trace the type of I/O from the filesystem
down to the block driver. There were two flags in the bio that were
missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
renamed two request flags that had a superfluous RW in them.

Note that the flags are in bio.h despite having the REQ_ name - as
blkdev.h includes bio.h that is the only way to go for now.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2010-08-08 00:20:39 +0800

33659ebba block: remove wrappers for request type/flags

Remove all the trivial wrappers for the cmd_type and cmd_flags fields in
struct request. This allows much easier grepping for different request
types instead of unwinding through macros.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2010-08-08 00:17:56 +0800

04 Jun, 2010

1 commit

1abec4fdb block: make blk_init_free_list and elevator_init idempotent

blk_init_allocated_queue_node may fail and the caller _could_ retry.
Accommodate the unlikely event that blk_init_allocated_queue_node is
called on an already initialized (possibly partially) request_queue.
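
As a rough userspace sketch of what an idempotent init can look like: the function returns early when the queue already has a scheduler attached, so a retry does no harm. The names `toy_queue`, `toy_default_sched` and `toy_elevator_init` are hypothetical and are not the kernel's structures or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct toy_queue {
    void *elevator;     /* NULL until a scheduler is attached */
    int   attach_count; /* how many times a scheduler was really attached */
};

static int toy_default_sched; /* placeholder scheduler object */

/*
 * Idempotent init: a second call on an already initialized queue is a
 * no-op that still reports success, so a caller may retry safely.
 */
int toy_elevator_init(struct toy_queue *q)
{
    if (q->elevator != NULL)
        return 0;
    q->elevator = &toy_default_sched;
    q->attach_count++;
    return 0;
}
```

A guard of this kind also means that a caller who wants a different scheduler has to clear the pointer first, otherwise the call silently keeps the old one.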

Signed-off-by: Mike Snitzer
Signed-off-by: Jens Axboe

Mike Snitzer
2010-06-04 19:47:06 +0800

24 May, 2010

1 commit

e36f724b4 block: Adjust elv_iosched_show to return "none" for bio-based DM

Bio-based DM doesn't use an elevator (queue is !blk_queue_stackable()).

Longer-term DM will not allocate an elevator for bio-based DM. But even
then there will be small potential for an elevator to be allocated for
a request-based DM table only to have a bio-based table be loaded in the
end.

Displaying "none" for bio-based DM will help avoid user confusion.

Signed-off-by: Mike Snitzer
Signed-off-by: Jens Axboe

Mike Snitzer
2010-05-24 15:07:32 +0800

11 May, 2010

1 commit

01effb0dc block: allow initialization of previously allocated request_queue

blk_init_queue() allocates the request_queue structure and then
initializes it as needed (request_fn, elevator, etc).

Split initialization out to blk_init_allocated_queue_node.
Introduce blk_init_allocated_queue wrapper function to model existing
blk_init_queue and blk_init_queue_node interfaces.

Export elv_register_queue to allow a newly added elevator to be
registered with sysfs. Export elv_unregister_queue for symmetry.

These changes allow DM to initialize a device's request_queue with more
precision. In particular, DM no longer unconditionally initializes a
full request_queue (elevator et al). It only does so for a
request-based DM device.

Signed-off-by: Mike Snitzer
Signed-off-by: Jens Axboe

Mike Snitzer
2010-05-11 14:57:42 +0800

09 Apr, 2010

1 commit

812d40264 blkio: Add io_merged stat

This includes both the number of bios merged into requests belonging to this
cgroup as well as the number of requests merged together.
In the past, we've observed different merging behavior across upstream kernels,
some by design, some due to actual bugs. This stat helps a lot in debugging such
problems when applications report decreased throughput with a new kernel
version.

This needed adding an extra elevator function to capture bios being merged as I
did not want to pollute elevator code with blkiocg knowledge and hence needed
the accounting invocation to come from CFQ.

Signed-off-by: Divyesh Shah
Signed-off-by: Jens Axboe

Divyesh Shah
2010-04-09 14:36:07 +0800

02 Apr, 2010

1 commit

a506aedc5 Block: Fix block/elevator.c elevator_get() off-by-one error

elevator_get() does not check the name length. If the name is longer than
sizeof(elv), elv will be missing its terminating '\0', and the "-iosched"
suffix in the elv buffer gets replaced by something like aaaaaaaaa, so the
following request_module() call can load an untrusted module.
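
A minimal userspace sketch of the kind of length check this calls for, assuming a bounded `snprintf()` is an acceptable way to build the module name. `toy_module_name` and `TOY_NAME_MAX` are hypothetical names chosen for illustration; this is not the code from the patch.

```c
#include <stdio.h>

#define TOY_NAME_MAX 16 /* assumed limit, chosen for illustration */

/*
 * Build "<name>-iosched" into out. Returns 0 on success, or -1 if the
 * result would not fit, in which case out is left as an empty string.
 */
int toy_module_name(char *out, size_t outlen, const char *name)
{
    int n = snprintf(out, outlen, "%s-iosched", name);

    if (n < 0 || (size_t)n >= outlen) {
        if (outlen > 0)
            out[0] = '\0'; /* never hand a truncated name to the caller */
        return -1;
    }
    return 0;
}
```

With a buffer of `TOY_NAME_MAX + sizeof("-iosched")` bytes, a name such as "deadline" fits, while an overlong run of 'a' characters is rejected instead of being truncated or overflowing.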

Signed-off-by: Zhitong Wang
Signed-off-by: Jens Axboe

wzt.wzt@gmail.com
2010-04-02 14:41:14 +0800

08 Mar, 2010

1 commit

52cf25d0a Driver core: Constify struct sysfs_ops in struct kobj_type

Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing

Signed-off-by: Emese Revfy
Acked-by: David Teigland
Acked-by: Matt Domsch
Acked-by: Maciej Sosnowski
Acked-by: Hans J. Koch
Acked-by: Pekka Enberg
Acked-by: Jens Axboe
Acked-by: Stephen Hemminger
Signed-off-by: Greg Kroah-Hartman

Emese Revfy
2010-03-08 09:04:49 +0800

29 Jan, 2010

1 commit

488991e28 block: Added in stricter no merge semantics for block I/O

Updated 'nomerges' tunable to accept a value of '2' - indicating that _no_
merges at all are to be attempted (not even the simple one-hit cache).

The following table illustrates the additional benefit - 5 minute runs of
a random I/O load were applied to a dozen devices on a 16-way x86_64 system.

nomerges  Throughput    %System      Improvement (tput / %sys)
--------  ------------  -----------  -------------------------
0         12.45 MB/sec  0.669365609
1         12.50 MB/sec  0.641519199  0.40% / 2.71%
2         12.52 MB/sec  0.639849750  0.56% / 2.96%
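
The three values of the tunable can be sketched as follows. This is a reading of the description above rather than the patch itself: it assumes that 0 allows all merge lookups and that 1 keeps only the simple one-hit cache. The enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names for the three values of the 'nomerges' tunable. */
enum toy_nomerges {
    TOY_MERGE_ALL = 0,        /* assumed: full merge lookups allowed */
    TOY_MERGE_CACHE_ONLY = 1, /* assumed: only the simple one-hit cache */
    TOY_MERGE_NONE = 2        /* no merge attempts at all */
};

/* May the simple one-hit cache be consulted at this setting? */
bool toy_try_one_hit_cache(enum toy_nomerges m)
{
    return m != TOY_MERGE_NONE;
}

/* May the more expensive merge lookups run at this setting? */
bool toy_try_full_lookup(enum toy_nomerges m)
{
    return m == TOY_MERGE_ALL;
}
```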

Signed-off-by: Alan D. Brunelle
Signed-off-by: Jens Axboe

Alan D. Brunelle
2010-01-29 16:04:08 +0800

13 Oct, 2009

1 commit

c30f33437 Merge branch 'for-linus' into for-2.6.33

Jens Axboe
2009-10-13 18:29:45 +0800

09 Oct, 2009

1 commit

8c2795985 elv_iosched_store(): fix strstrip() misuse

elv_iosched_store() ignores the return value of strstrip(). This causes
slightly inconsistent behavior.

This patch fixes it.

Before the patch:
====================================
# cd /sys/block/{blockdev}/queue

case1:
# echo "anticipatory" > scheduler
# cat scheduler
noop [anticipatory] deadline cfq

case2:
# echo "anticipatory " > scheduler
# cat scheduler
noop [anticipatory] deadline cfq

case3:
# echo " anticipatory" > scheduler
bash: echo: write error: Invalid argument

After the patch:
====================================
# cd /sys/block/{blockdev}/queue

case1:
# echo "anticipatory" > scheduler
# cat scheduler
noop [anticipatory] deadline cfq

case2:
# echo "anticipatory " > scheduler
# cat scheduler
noop [anticipatory] deadline cfq

case3:
# echo " anticipatory" > scheduler
# cat scheduler
noop [anticipatory] deadline cfq
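
The difference between the two runs comes down to whether the caller uses the pointer that the strip helper returns. A userspace sketch, assuming a helper that trims trailing whitespace in place and returns a pointer past any leading whitespace; `toy_strstrip` is a hypothetical stand-in, not the kernel's strstrip():

```c
#include <ctype.h>
#include <string.h>

/*
 * Userspace sketch of a strstrip()-style helper: trailing whitespace is
 * cut in place, and the return value points past leading whitespace.
 */
char *toy_strstrip(char *s)
{
    size_t len = strlen(s);

    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
    while (*s && isspace((unsigned char)*s))
        s++;
    return s;
}
```

If the return value is ignored, a trailing space is still removed (case2 works either way), but a leading space stays in the buffer, so a later name comparison fails as in case3 of the first run.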

Cc: Li Zefan
Cc: Jens Axboe
Signed-off-by: KOSAKI Motohiro
Signed-off-by: Jens Axboe

KOSAKI Motohiro
2009-10-09 14:48:08 +0800

03 Oct, 2009

1 commit

492af6350 block: remove the anticipatory IO scheduler

AS is mostly a subset of CFQ, so there's little point in still
providing this separate IO scheduler. Hopefully at some point we
can get down to one single IO scheduler again, at least this brings
us closer by having only one intelligent IO scheduler.

Signed-off-by: Jens Axboe

Jens Axboe
2009-10-03 15:37:51 +0800

11 Sep, 2009

2 commits

1f98a13f6 bio: first step in sanitizing the bio->bi_rw flag testing

Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.

Signed-off-by: Jens Axboe

Jens Axboe
2009-09-11 20:33:31 +0800

da6c5c720 scsi,block: update SCSI to handle mixed merge failures

Update scsi_io_completion() such that it only fails requests till the
next error boundary and retries the leftover. This enables block layer
to merge requests with different failfast settings and still behave
correctly on errors. Allow merge of requests of different failfast
settings.

As SCSI is currently the only subsystem which follows failfast status,
there's no need to worry about other block drivers for now.

Signed-off-by: Tejun Heo
Cc: Niel Lambrechts
Cc: James Bottomley
Signed-off-by: Jens Axboe

Tejun Heo
2009-09-11 20:33:30 +0800

17 Jul, 2009

1 commit

0a09f4319 block: fix failfast merge testing in elv_rq_merge_ok()

Commit ab0fd1debe730ec9998678a0c53caefbd121ed10 tries to prevent merge
of requests with different failfast settings. In elv_rq_merge_ok(),
it compares new bio's failfast flags against the merge target
request's. However, the flag testing accessors for bio and blk don't
return boolean but the tested bit value directly and FAILFAST on bio
and blk don't match, so directly comparing them with == results in
false negatives, unnecessarily preventing merging of readahead requests.

This patch converts the results to boolean by negating them before
comparison.
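
The pitfall can be shown in a few lines of plain C. In this sketch the bit positions are assumed values picked only so that the two flags differ, and all names are hypothetical; it is not the code from elv_rq_merge_ok().

```c
#include <stdbool.h>

/* Assumed, illustrative bit positions; the two flag sets differ on purpose. */
#define TOY_BIO_FAILFAST (1u << 3)
#define TOY_RQ_FAILFAST  (1u << 5)

/* Buggy: compares raw masked values, which differ even when both are set. */
bool toy_same_failfast_raw(unsigned bio_flags, unsigned rq_flags)
{
    return (bio_flags & TOY_BIO_FAILFAST) == (rq_flags & TOY_RQ_FAILFAST);
}

/* Fixed: negate each test first so both sides are 0 or 1. */
bool toy_same_failfast(unsigned bio_flags, unsigned rq_flags)
{
    return !(bio_flags & TOY_BIO_FAILFAST) == !(rq_flags & TOY_RQ_FAILFAST);
}
```

With both flags set, the raw comparison sees 8 and 32 and reports a mismatch, while the negated form sees 0 and 0 and reports a match.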

Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Boaz Harrosh
Cc: FUJITA Tomonori
Cc: James Bottomley
Cc: Jeff Garzik

Tejun Heo
2009-07-17 13:50:43 +0800

04 Jul, 2009

1 commit

ab0fd1deb block: don't merge requests of different failfast settings

Block layer used to merge requests and bios with different failfast
settings. This caused regular IOs to fail prematurely when they were
merged into failfast requests for readahead.

Niel Lambrechts could trigger the problem semi-reliably on ext4 when
resuming from STR. ext4 uses readahead when reading inodes and
combined with the deterministic extra SATA PHY exception cycle during
resume on the specific configuration, non-readahead inode read would
fail causing ext4 errors. Please read the following thread for
details.

http://lkml.org/lkml/2009/5/23/21

This patch makes block layer reject merging if the failfast settings
don't match. This is correct but likely to lower IO performance by
preventing regular IOs from mingling into surrounding readahead
requests. Changes to allow such mixed merges and handle errors
correctly will be added later.

Signed-off-by: Tejun Heo
Reported-by: Niel Lambrechts
Cc: Theodore Tso
Signed-off-by: Jens Axboe

Tejun Heo
2009-07-04 03:06:45 +0800

12 Jun, 2009

1 commit

c9059598e Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block

* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
block: add request clone interface (v2)
floppy: fix hibernation
ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
fs/bio.c: add missing __user annotation
block: prevent possible io_context->refcount overflow
Add serial number support for virtio_blk, V4a
block: Add missing bounce_pfn stacking and fix comments
Revert "block: Fix bounce limit setting in DM"
cciss: decode unit attention in SCSI error handling code
cciss: Remove no longer needed sendcmd reject processing code
cciss: change SCSI error handling routines to work with interrupts enabled.
cciss: separate error processing and command retrying code in sendcmd_withirq_core()
cciss: factor out fix target status processing code from sendcmd functions
cciss: simplify interface of sendcmd() and sendcmd_withirq()
cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
block: needs to set the residual length of a bidi request
Revert "block: implement blkdev_readpages"
block: Fix bounce limit setting in DM
Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
...

Manually fix conflicts with tracing updates in:
block/blk-sysfs.c
drivers/ide/ide-atapi.c
drivers/ide/ide-cd.c
drivers/ide/ide-floppy.c
drivers/ide/ide-tape.c
include/trace/events/block.h
kernel/trace/blktrace.c

Linus Torvalds
2009-06-12 02:10:35 +0800

10 Jun, 2009

1 commit

55782138e tracing/events: convert block trace points to TRACE_EVENT()

TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
these new capabilities to this tracepoint:

- zero-copy and per-cpu splice() tracing
- binary tracing without printf overhead
- structured logging records exposed under /debug/tracing/events
- trace events embedded in function tracer output and other plugins
- user-defined, per tracepoint filter expressions
...

Cons:

- no dev_t info for the output of plug, unplug_timer and unplug_io events.
no dev_t info for getrq and sleeprq events if bio == NULL.
no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.

This is mainly because we can't get the device from a request queue.
But this may change in the future.

- A packet command is converted to a string in TP_assign, not TP_print,
whereas blktrace does the conversion just before output.

Since pc requests should be rather rare, this is not a big issue.

- In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
has a unique format, which means we have some unused data in a trace entry.

The overhead is minimized by using __dynamic_array() instead of __array().

I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:

     dd                 dd + ioctl blktrace   dd + TRACE_EVENT (splice)
1    7.36s, 42.7 MB/s   7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
2    7.43s, 42.3 MB/s   7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
3    7.38s, 42.6 MB/s   7.45s, 42.2 MB/s      7.41s, 42.5 MB/s

So the overhead of tracing is very small, and no regression when using
those trace events vs blktrace.

And the binary output of TRACE_EVENT is much smaller than blktrace:

# ls -l -h
-rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
-rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
-rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out

Following are some comparisons between TRACE_EVENT and blktrace:

plug:
kjournald-480 [000] 303.084981: block_plug: [kjournald]
kjournald-480 [000] 303.084981: 8,0 P N [kjournald]

unplug_io:
kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1

remap:
kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8

Changelog from v2 -> v3:

- use the newly introduced __dynamic_array().

Changelog from v1 -> v2:

- use __string() instead of __array() to minimize the memory required
to store hex dump of rq->cmd().

- support large pc requests.

- add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.

- some cleanups.

Signed-off-by: Li Zefan
LKML-Reference:
Signed-off-by: Steven Rostedt

Li Zefan
2009-06-10 00:34:23 +0800

02 Jun, 2009

1 commit

53c663ce0 block: fix a possible oops on elv_abort_queue()

I found one more mis-conversion to the 'request is always dequeued
when completing' model in elv_abort_queue() during code inspection.
Although I haven't hit any problem caused by this mis-conversion yet
and just done compile/boot test, please apply if you have no problem.

Request must be dequeued when it completes.
However, elv_abort_queue() completes requests without dequeueing.
This will cause an oops in __blk_end_request_all().
This patch fixes the oops.

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Signed-off-by: Jens Axboe

Kiyoshi Ueda
2009-06-02 14:44:01 +0800

23 May, 2009

1 commit

cd43e26f0 block: Expose stacked device queues in sysfs

Currently stacking devices do not have a queue directory in sysfs.
However, many of the I/O characteristics like sector size, maximum
request size, etc. are queue properties.

This patch enables the queue directory for MD/DM devices. The elevator
code has been modified to deal with queues that do not have an I/O
scheduler.

Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe

Martin K. Petersen
2009-05-23 05:22:55 +0800

20 May, 2009

1 commit

0a7ae2ff0 block: change the tag sync vs async restriction logic

Make them fully share the tag space, but disallow async requests from using
the last two slots.
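
One way to picture the rule is a single admission check over a shared tag space. This is a sketch under assumptions: the function name and parameters are hypothetical, and leaving very shallow queues (two slots or fewer) unrestricted is a simplification of this sketch, not something the commit message states.

```c
#include <stdbool.h>

/*
 * Sketch: sync and async requests share one tag space of `depth` slots,
 * but async requests may not take either of the last two free slots.
 */
bool toy_may_take_tag(bool is_sync, unsigned in_use, unsigned depth)
{
    unsigned limit = depth;

    if (!is_sync && depth > 2)
        limit = depth - 2; /* keep two slots back for sync requests */
    return in_use < limit;
}
```

With a depth of 8, an async request is admitted while at most 5 tags are in use, whereas a sync request is admitted until all 8 are taken.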

Signed-off-by: Jens Axboe

Jens Axboe
2009-05-20 14:54:31 +0800

11 May, 2009

1 commit

83096ebf1 block: convert to pos and nr_sectors accessors

With recent cleanups, there is no place where low level driver
directly manipulates request fields. This means that the 'hard'
request fields always equal the !hard fields. Convert all
rq->sectors, nr_sectors and current_nr_sectors references to
accessors.

While at it, drop superfluous blk_rq_pos() < 0 test in swim.c.

[ Impact: use pos and nr_sectors accessors ]

Signed-off-by: Tejun Heo
Acked-by: Geert Uytterhoeven
Tested-by: Grant Likely
Acked-by: Grant Likely
Tested-by: Adrian McMenamin
Acked-by: Adrian McMenamin
Acked-by: Mike Miller
Cc: James Bottomley
Cc: Bartlomiej Zolnierkiewicz
Cc: Borislav Petkov
Cc: Sergei Shtylyov
Cc: Eric Moore
Cc: Alan Stern
Cc: FUJITA Tomonori
Cc: Pete Zaitcev
Cc: Stephen Rothwell
Cc: Paul Clements
Cc: Tim Waugh
Cc: Jeff Garzik
Cc: Jeremy Fitzhardinge
Cc: Alex Dubov
Cc: David Woodhouse
Cc: Martin Schwidefsky
Cc: Dario Ballabio
Cc: David S. Miller
Cc: Rusty Russell
Cc: unsik Kim
Cc: Laurent Vivier
Signed-off-by: Jens Axboe

Tejun Heo
2009-05-11 15:50:54 +0800

28 Apr, 2009

3 commits

40cbbb781 block: implement and use [__]blk_end_request_all()

There are many [__]blk_end_request() call sites which call it with
full request length and expect full completion. Many of them ensure
that the request actually completes by doing BUG_ON() the return
value, which is awkward and error-prone.

This patch adds [__]blk_end_request_all() which takes @rq and @error
and fully completes the request. BUG_ON() is added to ensure that
this actually happens.

Most conversions are simple but there are a few noteworthy ones.

* cdrom/viocd: viocd_end_request() replaced with direct calls to
__blk_end_request_all().

* s390/block/dasd: dasd_end_request() replaced with direct calls to
__blk_end_request_all().

* s390/char/tape_block: tapeblock_end_request() replaced with direct
calls to blk_end_request_all().

[ Impact: cleanup ]

Signed-off-by: Tejun Heo
Cc: Russell King
Cc: Stephen Rothwell
Cc: Mike Miller
Cc: Martin Schwidefsky
Cc: Jeff Garzik
Cc: Rusty Russell
Cc: Jeremy Fitzhardinge
Cc: Alex Dubov
Cc: James Bottomley

Tejun Heo
2009-04-28 13:37:35 +0800

158dbda00 block: reorganize request fetching functions

Impact: code reorganization

elv_next_request() and elv_dequeue_request() are public block layer
interfaces rather than actual elevator implementation. They mostly deal with
how requests interact with block layer and low level drivers at the
beginning of request processing whereas __elv_next_request() is the
actual elevator request fetching interface.

Move the two functions to blk-core.c. This prepares for further
interface cleanup.

Signed-off-by: Tejun Heo

Tejun Heo
2009-04-28 13:37:34 +0800

a7f557923 block: kill blk_start_queueing()

blk_start_queueing() is identical to __blk_run_queue() except that it
doesn't check for recursion. None of the current users depends on
blk_start_queueing() running request_fn directly. Replace usages of
blk_start_queueing() with [__]blk_run_queue() and kill it.

[ Impact: removal of mostly duplicate interface function ]

Signed-off-by: Tejun Heo

Tejun Heo
2009-04-28 13:37:33 +0800

15 Apr, 2009

1 commit

f600abe2d block: fix bad spelling of quiesce

Credit goes to Andrew Morton for spotting this one.

Signed-off-by: Jens Axboe

Jens Axboe
2009-04-15 14:28:09 +0800

07 Apr, 2009

2 commits

26308eab6 block: fix inconsistency in I/O stat accounting code

This forces in_flight to be zero when turning off or on the I/O stat
accounting and stops updating I/O stats in attempt_merge() when
accounting is turned off.

Signed-off-by: Jerome Marchand
Signed-off-by: Jens Axboe

Jerome Marchand
2009-04-07 14:12:38 +0800

6c7e8cee6 block: elevator quiescing helpers

Simple helper functions to quiesce the request queue. These are
currently only used for switching IO schedulers on-the-fly, but
we can use them to properly switch IO accounting on and off as well.

Signed-off-by: Jerome Marchand
Signed-off-by: Jens Axboe

Jens Axboe
2009-04-07 14:12:37 +0800

06 Apr, 2009

1 commit

1faa16d22 block: change the request allocation/congestion logic to be sync/async based

This makes sure that we never wait on async IO for sync requests, instead
of doing the split on writes vs reads.

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2009-04-06 23:04:53 +0800

29 Dec, 2008

3 commits

b374d18a4 block: get rid of elevator_t typedef

Just use struct elevator_queue everywhere instead.

Signed-off-by: Jens Axboe

Jens Axboe
2008-12-29 15:29:50 +0800

58eea927d block: simplify empty barrier implementation

Empty barrier required special handling in __elv_next_request() to
complete it without letting the low level driver see it.

With previous changes, barrier code is now flexible enough to skip the
BAR step using the same barrier sequence selection mechanism. Drop
the special handling and mask off q->ordered from start_ordered().

Remove blk_empty_barrier() test which now has no user.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2008-12-29 15:28:45 +0800

8f11b3e99 block: make barrier completion more robust

Barrier completion had the following assumptions.

* start_ordered() couldn't finish the whole sequence properly. If all
actions are to be skipped, q->ordseq is set correctly but the actual
completion was never triggered thus hanging the barrier request.

* Drain completion in elv_complete_request() assumed that there's
always at least one request in the queue when drain completes.

Both assumptions are true but these assumptions need to be removed to
improve empty barrier implementation. This patch makes the following
changes.

* Make start_ordered() use blk_ordered_complete_seq() to mark skipped
steps complete and notify __elv_next_request() that it should fetch
the next request if the whole barrier has completed inside
start_ordered().

* Make drain completion path in elv_complete_request() check whether
the queue is empty. Empty queue also indicates drain completion.

* While at it, convert 0/1 return from blk_do_ordered() to false/true.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2008-12-29 15:28:45 +0800

05 Dec, 2008

1 commit

970987beb Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/urgent' into tracing/core

Ingo Molnar
2008-12-05 21:45:22 +0800

03 Dec, 2008

1 commit

53a08807c block: internal dequeue shouldn't start timer

blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
both start the timeout timer. Barrier code dequeues the original
barrier request but doesn't pass the request itself to the lower level
driver, only broken down proxy requests; however, the original
barrier request goes through the same dequeue path, so the timeout timer
is started on it. If the barrier sequence takes long enough, this timer
expires but the low level driver has no idea about this request and
an oops follows.

Timeout timer shouldn't have been started on the original barrier
request as it never goes through actual IO. This patch unexports
elv_dequeue_request(), which has no external user anyway, and makes it
operate on the elevator proper w/o adding the timer, and makes
blkdev_dequeue_request() call elv_dequeue_request() and add the timer.
Internal users which don't pass the request to driver - barrier code
and end_that_request_last() - are converted to use
elv_dequeue_request().

Signed-off-by: Tejun Heo
Cc: Mike Anderson
Signed-off-by: Jens Axboe

Tejun Heo
2008-12-03 19:41:26 +0800

26 Nov, 2008

2 commits

0bfc24559 blktrace: port to tracepoints, update

Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.

Signed-off-by: Ingo Molnar
Acked-by: Mathieu Desnoyers

Ingo Molnar
2008-11-26 20:04:35 +0800

5f3ea37c7 blktrace: port to tracepoints

This was a forward port of work done by Mathieu Desnoyers, I changed it to
encode the 'what' parameter on the tracepoint name, so that one can register
interest in specific events and not on classes of events to then check the
'what' parameter.

Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Jens Axboe
Signed-off-by: Ingo Molnar

Arnaldo Carvalho de Melo
2008-11-26 19:13:34 +0800

06 Nov, 2008

1 commit

2920ebbd6 block: add timer on blkdev_dequeue_request() not elv_next_request()

Block queue supports two usage models - one where block driver peeks
at the front of queue using elv_next_request(), processes it and
finishes it and the other where block driver peeks at the front of
queue, dequeues the request using blkdev_dequeue_request() and finishes
it. The latter is more flexible as it allows the driver to process
multiple commands concurrently.

These two inconsistent usage models make the block layer
implementation confusing. For some, elv_next_request() is considered
the issue point while others consider blkdev_dequeue_request() the
issue point.

Till now the inconsistency mostly affected only accounting, so it didn't
really break anything seriously; however, with block layer timeout,
this inconsistency hits hard. Block layer considers
elv_next_request() the issue point and adds timer but SCSI layer
thinks it was just peeking and, when it can't process the
command right away, the request is just left there without further
processing. This leaves the request dangling on the timer list and, when
the timer goes off, the request which the SCSI layer and below think is
still on the block queue ends up in the EH queue, causing various
problems - EH hang (failed count goes over busy count and EH never wakes
up), WARN_ON() and oopses as the low level driver tries to handle the
unknown command, etc. depending on the timing.

As SCSI midlayer is the only user of block layer timer at the moment,
moving blk_add_timer() to elv_dequeue_request() fixes the problem;
however, these two usage models definitely need to be cleaned up in the
future.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-06 15:41:55 +0800

17 Oct, 2008

1 commit

80a4b58e3 block: only call ->request_fn when the queue is not stopped

Callers should use either blk_run_queue/__blk_run_queue, or
blk_start_queueing() to invoke request handling instead of calling
->request_fn() directly as that does not take the queue stopped
flag into account.
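
The idea of routing every invocation through a helper that honours the stopped flag can be sketched in userspace as below. `toy_runq`, `toy_run_queue` and `toy_handle_one` are hypothetical names; the real helpers also deal with locking and other state that this sketch leaves out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request queue with a stopped flag. */
struct toy_runq {
    bool stopped;                           /* set while the driver cannot take work */
    int  handled;                           /* requests handled so far */
    void (*request_fn)(struct toy_runq *q); /* driver's request handler */
};

/* Callers go through this wrapper instead of calling request_fn directly. */
void toy_run_queue(struct toy_runq *q)
{
    if (q->stopped)
        return; /* respect the stopped flag */
    q->request_fn(q);
}

/* Example handler used only for demonstration. */
void toy_handle_one(struct toy_runq *q)
{
    q->handled++;
}
```

Calling `toy_run_queue()` on a stopped queue does nothing; once `stopped` is cleared, the same call reaches the handler.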

Also add appropriate comments on the above functions to detail
their usage.

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-17 14:46:57 +0800