18 Oct, 2008

1 commit

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: remove __generic_unplug_device() from exports
    block: move q->unplug_work initialization
    blktrace: pass zfcp driver data
    blktrace: add support for driver data
    block: fix current kernel-doc warnings
    block: only call ->request_fn when the queue is not stopped
    block: simplify string handling in elv_iosched_store()
    block: fix kernel-doc for blk_alloc_devt()
    block: fix nr_phys_segments miscalculation bug
    block: add partition attribute for partition number
    block: add BIG FAT WARNING to CONFIG_DEBUG_BLOCK_EXT_DEVT
    softirq: Add support for triggering softirq work on softirqs.

    Linus Torvalds
     

13 Oct, 2008

1 commit

  • Multipath is best at handling transport errors. If it gets a device
    error, there is not much the multipath layer can do: it would just
    access the same device, only from a different path.

    This patch breaks up failfast into device, transport and driver errors.
    The multipath layers (md and dm multipath) only ask the lower levels to
    fast fail transport errors. The user of failfast, readahead, will ask
    to fast fail on all errors.

    Note that blk_noretry_request will return true if any failfast bit
    is set. This allows drivers that do not support the multipath failfast
    bits to continue to fail on any failfast error as before. Drivers
    like scsi that are able to fast fail specific errors can check
    for the specific failfast type, as sketched below. In the next patch
    I will convert scsi.

    Signed-off-by: Mike Christie
    Cc: Jens Axboe
    Signed-off-by: James Bottomley

    Mike Christie
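
    A sketch of how a driver aware of the split bits might use them (the
    REQ_FAILFAST_* names follow this patch; the function and the error
    policy shown are illustrative only):

        #include <linux/blkdev.h>

        /* Hypothetical error handler in a low-level driver. */
        static int my_handle_error(struct request *rq, int transport_error)
        {
                /* Fast-fail only the class of error the queuer asked for. */
                if (transport_error && (rq->cmd_flags & REQ_FAILFAST_TRANSPORT))
                        return -EIO;
                if (!transport_error && (rq->cmd_flags & REQ_FAILFAST_DEV))
                        return -EIO;
                /* Drivers unaware of the split can keep using
                 * blk_noretry_request(rq), true if any failfast bit is set. */
                return 0;        /* fall back to normal retry handling */
        }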
     

09 Oct, 2008

27 commits

  • This is a wrapper for accessing a gendisk's integrity bits. It allows
    the integrity support in MD to be compiled with BLK_DEV_INTEGRITY off.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
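
    The usual shape of such a wrapper (a sketch, not the exact hunk): a
    real accessor when CONFIG_BLK_DEV_INTEGRITY is enabled and a
    NULL-returning stub otherwise, so MD's callers compile either way:

        #ifdef CONFIG_BLK_DEV_INTEGRITY
        static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
        {
                return disk->integrity;        /* the gendisk's integrity bits */
        }
        #else
        static inline struct blk_integrity *blk_get_integrity(struct gendisk *disk)
        {
                return NULL;        /* integrity support compiled out */
        }
        #endif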
     
  • The DM and MD integrity support now depends on being able to use
    gendisks instead of block_devices when comparing integrity profiles.
    Change function parameters accordingly.

    Also update comparison logic so that two NULL profiles are a valid
    configuration.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
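
    An illustrative sketch of the updated comparison logic (the function
    name and the 0/-1 return convention are assumptions, not the exact
    kernel code):

        static int integrity_profiles_match(struct gendisk *gd1,
                                            struct gendisk *gd2)
        {
                struct blk_integrity *b1 = blk_get_integrity(gd1);
                struct blk_integrity *b2 = blk_get_integrity(gd2);

                if (!b1 && !b2)
                        return 0;        /* two NULL profiles: valid configuration */
                if (!b1 || !b2)
                        return -1;        /* only one side has a profile */
                /* ... compare sector size, tag size, format name ... */
                return 0;
        }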
     
  • We need bdev_get_integrity() to support the pending md/dm patches.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This patch removes end_queued_request() and end_dequeued_request(),
    which are no longer used.

    As a result, the only remaining user of __end_request() is
    end_request(). So the actual code in __end_request() is moved
    into end_request() and __end_request() is removed.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
     
  • This patch adds a new interface, blk_lld_busy(), to check an lld's
    busy state from the block layer.
    blk_lld_busy() calls down into the low-level driver for the check
    if the driver has set q->lld_busy_fn() using blk_queue_lld_busy().

    This resolves a performance problem on request stacking devices,
    described below.

    Some drivers, like the scsi mid layer, stop dispatching requests when
    they detect a busy state on the low-level device (host/target/device).
    This allows other requests to stay in the I/O scheduler's queue
    for a chance of merging.

    Request stacking drivers like request-based dm should follow
    the same logic.
    However, there is no generic interface for the stacked device
    to check if the underlying device(s) are busy.
    If the request stacking driver dispatches and submits requests to
    the busy underlying device, the requests will stay in
    the underlying device's queue without a chance of merging.
    This causes a performance problem under bursty I/O load.

    With this patch, the busy state of the underlying device is exported
    via q->lld_busy_fn(), so the request stacking driver can check it
    and stop dispatching requests while the device is busy (see the
    sketch below).

    The underlying device driver must return the busy state appropriately:
    1: when the device driver can't process requests immediately.
    0: when the device driver can process requests immediately,
    including abnormal situations where the device driver needs
    to kill all requests.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: Andrew Morton
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
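
    In outline (a sketch; my_lld_busy_fn and my_dispatch are hypothetical
    names, registration happens in the low-level driver and the check in
    the stacking driver's dispatch path):

        #include <linux/blkdev.h>

        /* Low-level driver's busy check: 1 when busy, 0 otherwise. */
        static int my_lld_busy_fn(struct request_queue *q)
        {
                return 0;        /* report the real device state here */
        }

        /* Low-level driver, at queue setup time: */
        static void my_lld_init_queue(struct request_queue *q)
        {
                blk_queue_lld_busy(q, my_lld_busy_fn);
        }

        /* Request stacking driver (e.g. request-based dm): */
        static void my_dispatch(struct request_queue *lower_q)
        {
                if (blk_lld_busy(lower_q))
                        return;        /* keep requests above for merging */
                /* ... dispatch and submit clones to lower_q ... */
        }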
     
  • We don't want to idle in AS/CFQ if the device doesn't have a seek
    penalty. So add a QUEUE_FLAG_NONROT to indicate a non-rotational
    device; low-level drivers should set this flag upon discovery of
    an SSD or similar device type, as sketched below.

    Signed-off-by: Jens Axboe

    Jens Axboe
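
    What "set this flag upon discovery" might look like in a driver's
    probe path (a sketch; the unlocked setter follows the usual
    queue-flag pattern):

        #include <linux/blkdev.h>

        /* Once the device is known to be an SSD or similar: */
        static void my_mark_nonrot(struct request_queue *q)
        {
                queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
        }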
     
  • This patch adds a queue flag to indicate the block device can be
    used for request stacking.

    Request stacking drivers need to stack their devices only on top of
    devices whose q->request_fn is functional.
    Since bio stacking drivers (e.g. md, loop) basically initialize
    their queue using blk_alloc_queue() and don't set q->request_fn,
    checking for (q->request_fn == NULL) looks sufficient for that purpose.

    However, dm will become both types of stacking driver (bio-based and
    request-based), and dm will always set q->request_fn even if the dm
    device is bio-based, in which case q->request_fn is not actually
    functional. So we need something else to distinguish the type of the
    device. Adding a queue flag is a solution for that, as sketched below.

    The reason why dm always sets q->request_fn is to keep
    compatibility with dm user-space tools.
    Currently, all dm user-space tools use bio-based dm without
    specifying the type of the dm device they use.
    To use request-based dm without changing such tools, the kernel
    must decide the type of the dm device automatically.
    The automatic type decision can't be made at device creation time
    and needs to be deferred until such tools load a mapping table,
    since the actual type is decided by the dm target type included in
    the mapping table.

    So a dm device has to be initialized using blk_init_queue()
    so that we can load either type of table.
    Then, all queue functions are set (e.g. q->request_fn) and we have
    no way to distinguish whether it is bio-based or request-based,
    even after a table is loaded and the type of the device is decided.

    By the way, some parts of the queue (e.g. request_list, elevator)
    are not needed when the dm device is used as bio-based.
    But the memory size is not so large (about 20 KB per queue on ia64),
    so I hope the memory loss is acceptable for bio-based dm users.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
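
    A request stacking driver can then test the flag instead of
    q->request_fn (a sketch; blk_queue_stackable() is assumed here to be
    the usual test-macro shape the queue flags use):

        #include <linux/blkdev.h>

        /* Accept an underlying device only if its queue is truly
         * request-based, regardless of q->request_fn being set: */
        static int can_stack_on(struct block_device *bdev)
        {
                return blk_queue_stackable(bdev_get_queue(bdev));
        }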
     
  • This patch adds blk_insert_cloned_request(), a generic request
    submission interface for request stacking drivers.
    Request-based dm will use it to submit its clones to underlying
    devices, as sketched below.

    blk_rq_check_limits() is also added because it is possible that
    the lower queue has stricter limits than the upper queue
    when multiple drivers are stacking at the request level.
    Besides blk_insert_cloned_request()'s internal use, the function
    will be used by request-based dm when the queue limits are
    modified (e.g. by replacing dm's table).

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
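
    Rough shape of the submission path (a sketch; clone preparation and
    error handling are omitted):

        #include <linux/blkdev.h>

        static int submit_clone(struct request_queue *lower_q,
                                struct request *clone)
        {
                /* blk_insert_cloned_request() re-checks the lower queue's
                 * limits via blk_rq_check_limits(), which matters when the
                 * lower queue is stricter than the upper one. */
                return blk_insert_cloned_request(lower_q, clone);
        }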
     
  • This patch adds blk_update_request(), which updates a struct request
    by completing its data part without completing the struct request
    itself.
    Though it looks like end_that_request_first() of older kernels,
    blk_update_request() should be used only by request stacking drivers.

    Request-based dm will use it in the bio->bi_end_io callback to update
    the original request when a data part of a cloned request completes,
    as sketched below. The following is additional background on why
    request-based dm needs this interface.

    - Request stacking drivers can't use blk_end_request() directly from
    the lower driver's completion context (bio->bi_end_io or rq->end_io),
    because some device drivers (e.g. ide) may try to complete
    their request with the queue lock held, and it may cause deadlock.
    See below for a detailed description of the possible deadlock.

    - To solve that, request-based dm offloads the completion of the
    cloned struct request to softirq context (i.e. using
    blk_complete_request() from rq->end_io).

    - Though it is possible to use the same solution from bio->bi_end_io,
    it would delay the notification of bio completion to the original
    submitter. Also, it would cause inefficient partial completion,
    because the lower driver can't continue processing the cloned request
    and request-based dm would need to requeue and redispatch it to
    the lower driver again later. That's not good.

    - So request-based dm needs blk_update_request() to perform the bio
    completion in the lower driver's completion context, which is more
    efficient.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
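
    The intended call site, in outline (a sketch; request-based dm's real
    clone bookkeeping is more involved, and the bi_private use here is an
    assumption for illustration):

        #include <linux/blkdev.h>
        #include <linux/bio.h>

        static void clone_bio_end_io(struct bio *bio, int error)
        {
                struct request *orig = bio->bi_private;  /* original request */

                /* Complete only the data this clone's bio covered; the
                 * original struct request itself stays uncompleted. */
                blk_update_request(orig, error, bio->bi_size);
        }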
     
  • Don't put functions that are only used in fs/bio-integrity.c in
    blkdev.h; it's much cleaner to just keep them in there. Also kill
    the completely unused bdev_get_tag_size().

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Only works for the generic request timer handling. Allows one to
    sporadically ignore request completions, thus exercising the timeout
    handling.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Two mods to blkdev_issue_discard(), thinking ahead to its use on swap:

    1. Add gfp_mask argument, so swap allocation can use it where GFP_KERNEL
    might deadlock but GFP_NOIO is safe.

    2. Enlarge nr_sects argument from unsigned to sector_t: unsigned long is
    enough to cover a whole swap area, but sector_t suits any partition.

    Change sb_issue_discard()'s nr_blocks to sector_t too; but no need seen
    for a gfp_mask there, just pass GFP_KERNEL down to blkdev_issue_discard().

    Signed-off-by: Hugh Dickins
    Signed-off-by: Jens Axboe

    Hugh Dickins
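
    A swap-side caller can then look like this (a sketch; the wrapper
    function is illustrative, the signature matches the two mods above):

        #include <linux/blkdev.h>

        static int discard_swap_range(struct block_device *bdev,
                                      sector_t start, sector_t nr_sects)
        {
                /* GFP_NOIO: safe in the swap path, where GFP_KERNEL could
                 * recurse into reclaim and deadlock; nr_sects is sector_t. */
                return blkdev_issue_discard(bdev, start, nr_sects, GFP_NOIO);
        }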
     
  • Signed-off-by: Mike Anderson
    Signed-off-by: Jens Axboe

    Mike Anderson
     
  • Right now SCSI and others do their own command timeout handling.
    Move those bits to the block layer.

    Instead of having a timer per command, we try to be a bit more clever
    and simply have one per queue. This avoids the overhead of having to
    tear down and set up a timer for each command, so it will result in a
    lot less timer fiddling.

    Signed-off-by: Mike Anderson
    Signed-off-by: Jens Axboe

    Jens Axboe
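
    The per-queue setup this enables, in outline (a sketch; the handler
    name and the 30-second value are made up):

        #include <linux/blkdev.h>

        static enum blk_eh_timer_return my_rq_timed_out(struct request *rq)
        {
                /* decide: give the command more time, or handle the abort */
                return BLK_EH_RESET_TIMER;
        }

        static void my_init_queue(struct request_queue *q)
        {
                blk_queue_rq_timed_out(q, my_rq_timed_out);
                blk_queue_rq_timeout(q, 30 * HZ);        /* example value */
        }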
     
  • This adds the blk_rq_aligned helper function to see if the alignment
    and padding requirements are satisfied for a DMA transfer. This also
    converts blk_rq_map_kern and __blk_rq_map_user to use the helper
    function.

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
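
    In outline, the helper answers "can this buffer be mapped directly,
    or is a bounce copy needed?" (a sketch; argument types as of this
    series, the wrapper is illustrative):

        #include <linux/blkdev.h>

        static bool needs_copy(struct request_queue *q, void *kbuf,
                               unsigned int len)
        {
                /* false only if both the buffer address and the length
                 * satisfy the queue's DMA alignment and padding rules,
                 * as blk_rq_map_kern now decides via the helper */
                return !blk_rq_aligned(q, kbuf, len);
        }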
     
  • This patch introduces struct rq_map_data to enable bio_copy_user_iov()
    to use reserved pages.

    Currently, bio_copy_user_iov allocates bounce pages, but
    drivers/scsi/sg.c wants to allocate pages by itself and use
    them. struct rq_map_data can be used to pass allocated pages to
    bio_copy_user_iov.

    The current users of bio_copy_user_iov simply pass NULL (they don't
    want to use pre-allocated pages).

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Cc: Douglas Gilbert
    Cc: Mike Christie
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
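
    The shape of the new structure and the unchanged-callers case (a
    sketch; the field names are this era's and may differ in detail):

        struct rq_map_data {
                struct page **pages;        /* caller-reserved pages (sg) */
                int page_order;
                int nr_entries;
        };

        /* Existing callers keep the old bounce-page behaviour by passing
         * NULL map_data (the gfp argument is from the companion patch): */
        static int map_without_reserved_pages(struct request_queue *q,
                                              struct request *rq,
                                              void __user *ubuf,
                                              unsigned long len)
        {
                return blk_rq_map_user(q, rq, NULL, ubuf, len, GFP_KERNEL);
        }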
     
  • Currently, blk_rq_map_user and blk_rq_map_user_iov always do
    GFP_KERNEL allocation.

    This adds gfp_mask argument to blk_rq_map_user and blk_rq_map_user_iov
    so sg can use it (sg always does GFP_ATOMIC allocation).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Douglas Gilbert
    Cc: Mike Christie
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
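
    So a caller in atomic context can now do the following (a sketch; the
    NULL map_data argument reflects the companion rq_map_data patch
    above, and the wrapper is illustrative):

        #include <linux/blkdev.h>

        static int map_user_atomic(struct request_queue *q,
                                   struct request *rq,
                                   void __user *ubuf, unsigned long len)
        {
                /* sg maps user buffers where sleeping is not allowed */
                return blk_rq_map_user(q, rq, NULL, ubuf, len, GFP_ATOMIC);
        }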
     
  • Somewhat incomplete, as we do allow merges of requests and bios
    that have been given different completion CPUs. This is done on the
    assumption that a larger IO is still more beneficial than CPU
    locality.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This patch adds support for controlling the IO completion CPU of
    either all requests on a queue, or on a per-request basis. We export
    a sysfs variable (rq_affinity) which, if set, migrates completions
    of requests to the CPU that originally submitted them. A bio helper
    (bio_set_completion_cpu()) is also added, so that queuers can ask
    for completion on a specific CPU.

    In testing, this has been shown to cut system time by as much
    as 20-40% on synthetic workloads where CPU affinity is desired.

    This requires a little help from the architecture, so it'll only
    work as designed for archs that are using the new generic smp
    helper infrastructure.

    Signed-off-by: Jens Axboe

    Jens Axboe
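
    Per-request use, in outline (a sketch; queue-wide behaviour is
    toggled via the rq_affinity sysfs variable named above, presumably
    under /sys/block/<dev>/queue/):

        #include <linux/bio.h>
        #include <linux/smp.h>

        static void tag_bio_for_local_completion(struct bio *bio)
        {
                int cpu = get_cpu();        /* pin while reading the CPU id */

                bio_set_completion_cpu(bio, cpu);
                put_cpu();
        }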
     
  • Preparatory patch for checking queuing affinity.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Remove hw_segments field from struct bio and struct request. Without virtual
    merge accounting they have no purpose.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Jens Axboe

    Mikulas Patocka
     
  • struct request has an ioprio member but it is never updated because
    currently bios do not hold io context information. The implication of
    this is that virtio_blk ends up passing useless information to the
    backend driver.

    That said, some IO schedulers such as CFQ do store io context
    information in struct request, but use private members for that, which
    means that the information cannot be directly accessed in an IO
    scheduler-independent way.

    This patch adds a function to obtain the ioprio of a request. We should
    avoid accessing ioprio directly and use this function instead, so that
    its users do not have to care about future changes in block layer
    structures or what the currently active IO controller is.

    This patch does not introduce any functional changes but paves the way
    for future clean-ups and enhancements.

    Signed-off-by: Fernando Luis Vazquez Cao
    Acked-by: Rusty Russell
    Signed-off-by: Jens Axboe

    Fernando Luis Vázquez Cao
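
    The accessor is essentially a one-liner (a sketch matching the
    description; it wraps the existing rq->ioprio member):

        static inline unsigned short req_get_ioprio(struct request *req)
        {
                /* shields callers from future changes to where the block
                 * layer keeps per-request io priority */
                return req->ioprio;
        }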
     
  • It was only used by ps3disk, and it should probably have been
    REQ_TYPE_LINUX_BLOCK + REQ_LB_OP_FLUSH.

    Signed-off-by: David Woodhouse
    Signed-off-by: Jens Axboe

    David Woodhouse
     
  • But blkdev_issue_discard() still emits requests which are interpreted as
    soft barriers, because naïve callers might otherwise issue subsequent
    writes to those same sectors, which might cross on the queue (if they're
    reallocated quickly enough).

    Callers still _can_ issue non-barrier discard requests, but they have to
    take care of queue ordering for themselves.

    Signed-off-by: David Woodhouse
    Signed-off-by: Jens Axboe

    David Woodhouse
     
  • Signed-off-by: David Woodhouse
    Signed-off-by: Jens Axboe

    David Woodhouse
     
  • Some block devices benefit from a hint that they can forget the contents
    of certain sectors. Add basic support for this to the block core, along
    with a 'blkdev_issue_discard()' helper function which issues such
    requests.

    The caller doesn't get to provide an end_io function, since
    blkdev_issue_discard() will automatically split the request up into
    multiple bios if appropriate. Neither does the function wait for
    completion -- it's expected that callers won't care about when, or even
    _if_, the request completes. It's only a hint to the device anyway. By
    definition, the file system doesn't _care_ about these sectors any more.

    [With feedback from OGAWA Hirofumi and Jens Axboe]
    Signed-off-by: Jens Axboe

    David Woodhouse
     
  • Signed-off-by: David Woodhouse
    Signed-off-by: Jens Axboe

    David Woodhouse
     

27 Aug, 2008

3 commits

  • They are unused and ->busy doesn't exist anymore.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Technically, the cmd_filter could be applied to other protocols,
    though that's unlikely to happen. Putting SCSI stuff in request_queue
    is a bit of a layer violation. So let's rename it.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • cmd_filter works only for the block layer SG_IO with SCSI block
    devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
    character devices (such as st). We hit a kernel crash with them.

    The problem is that the cmd_filter code accesses the gendisk (which
    holds struct blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. That
    works only for SCSI block device files. With character device files,
    inode->i_bdev leads you to struct cdev, so
    inode->i_bdev->bd_disk->blk_scsi_cmd_filter isn't safe.

    SCSI ULDs don't expose the gendisk; they keep it private. bsg needs
    to be independent of any protocol. We shouldn't change ULDs to expose
    their gendisks.

    This patch moves struct blk_scsi_cmd_filter from the gendisk to the
    request_queue, a common object which everyone can access.

    The user interface doesn't change; users can change the filters via
    /sys/block/. The gendisk has a pointer to the request_queue, through
    which the cmd_filter code accesses struct blk_scsi_cmd_filter.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

16 Jul, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits)
    ide-floppy: fix unfortunate function naming
    ide-tape: unify idetape_create_read/write_cmd
    ide: add ide_pc_intr() helper
    ide-{floppy,scsi}: read Status Register before stopping DMA engine
    ide-scsi: add more debugging to idescsi_pc_intr()
    ide-scsi: use pc->callback
    ide-floppy: add more debugging to idefloppy_pc_intr()
    ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled
    ide-tape: add ide_tape_io_buffers() helper
    ide-tape: factor out DSC handling from idetape_pc_intr()
    ide-{floppy,tape}: move checking of ->failed_pc to ->callback
    ide: add ide_issue_pc() helper
    ide: add PC_FLAG_DRQ_INTERRUPT pc flag
    ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc()
    ide: add ide_transfer_pc() helper
    ide-scsi: set drive->scsi flag for devices handled by the driver
    ide-{cd,floppy,tape}: remove checking for drive->scsi
    ide: add PC_FLAG_ZIP_DRIVE pc flag
    ide-tape: factor out waiting for good ireason from idetape_transfer_pc()
    ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc()
    ...

    Linus Torvalds
     
  • All users of blk_end_sync_rq have gone (they were converted to use
    blk_execute_rq). This unexports blk_end_sync_rq.

    Signed-off-by: FUJITA Tomonori
    Cc: Borislav Petkov
    Signed-off-by: Jens Axboe
    Signed-off-by: Bartlomiej Zolnierkiewicz

    FUJITA Tomonori
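
    The replacement pattern for former blk_end_sync_rq users, in outline
    (a sketch; blk_execute_rq issues the request and waits for its
    completion, and the wrapper is illustrative):

        #include <linux/blkdev.h>

        static int issue_and_wait(struct request_queue *q,
                                  struct gendisk *disk, struct request *rq)
        {
                /* final 0: queue at the tail rather than the head */
                return blk_execute_rq(q, disk, rq, 0);
        }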
     

04 Jul, 2008

1 commit

  • This adds blk_queue_update_dma_pad to prevent LLDs from wrongly
    overwriting the dma pad mask (we added blk_queue_update_dma_alignment
    for the same reason).

    This also converts libata to use blk_queue_update_dma_pad instead of
    blk_queue_dma_pad.

    Signed-off-by: FUJITA Tomonori
    Cc: Tejun Heo
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Thomas Bogendoerfer
    Cc: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
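
    The "update" semantics, in outline (a sketch: the pad mask only ever
    grows, so one LLD cannot shrink padding another LLD already relies
    on):

        #include <linux/blkdev.h>

        void blk_queue_update_dma_pad(struct request_queue *q,
                                      unsigned int mask)
        {
                /* keep the strictest (largest) pad mask seen so far */
                if (mask > q->dma_pad_mask)
                        q->dma_pad_mask = mask;
        }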
     

03 Jul, 2008

1 commit