02 Oct, 2009

4 commits

  • Currently we set the bio size to the byte equivalent of the blocks to
    be trimmed when submitting the initial DISCARD ioctl. That means it
    is subject to the max_hw_sectors limitation of the HBA which is
    much lower than the size of a DISCARD request we can support.
    Add a separate max_discard_sectors tunable to limit the size for discard
    requests.

    We limit the max discard request size in bytes to what fits in 32 bits,
    as that is the limit for bio->bi_size. This could be much larger if we
    had a way to pass that information through the block layer.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
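
    A minimal sketch of how a driver might use the new tunable (the driver
    function and the 32 MiB cap are made-up for illustration):

        /* sketch: discard requests get their own, much higher cap */
        static void mydrv_set_limits(struct request_queue *q)
        {
                blk_queue_max_sectors(q, 256);           /* HBA data limit    */
                blk_queue_max_discard_sectors(q, 65536); /* 32 MiB of discard */
        }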
     
  • prepare_discard_fn() was being called in a place where memory allocation
    was effectively impossible. This makes it inappropriate for all but
    the most trivial translations of Linux's DISCARD operation to the block
    command set. Additionally adding a payload there makes the ownership
    of the bio backing unclear as it's now allocated by the device driver
    and not the submitter as usual.

    It is replaced with QUEUE_FLAG_DISCARD which is used to indicate whether
    the queue supports discard operations or not. blkdev_issue_discard now
    allocates a one-page, sector-length payload which is the right thing
    for the common ATA and SCSI implementations.

    The mtd implementation of prepare_discard_fn() is replaced with simply
    checking for the request being a discard.

    Largely based on a previous patch from Matthew Wilcox, which removed
    prepare_discard_fn() but did not yet do the different payload
    allocation.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
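
    Roughly, a driver that can translate discards now just sets the queue
    flag, and submitters test it (a sketch, not lifted verbatim from the
    patch):

        /* driver side: advertise discard support */
        queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);

        /* submitter side: bail out early if the queue cannot discard,
         * much as blkdev_issue_discard() now does */
        if (!blk_queue_discard(q))
                return -EOPNOTSUPP;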
     
  • Stacking devices do not have an inherent max_hw_sector limit. Set the
    default to INT_MAX so we are bounded only by the capabilities of the
    underlying storage.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • The topology changes unintentionally caused SAFE_MAX_SECTORS to be set
    for stacking devices. Set the default limit to BLK_DEF_MAX_SECTORS and
    provide SAFE_MAX_SECTORS in blk_queue_make_request() for legacy hw
    drivers that depend on the old behavior.

    Acked-by: Mike Snitzer
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

28 Jul, 2009

1 commit

  • Move the assignment of a default lock below blk_init_queue() to
    blk_queue_make_request(), so we also get to set the default lock
    for ->make_request_fn() based drivers. This is important since the
    queue flag locking requires a lock to be in place.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

03 Jun, 2009

1 commit

  • blk_queue_bounce_limit() is more than a simple wrapper around the request
    queue's limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn(), which can
    be called by stacking drivers that wish to set the bounce limit
    explicitly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
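
    A sketch of the intended use; bounce_pfn stands for a limit the stacking
    driver computed while combining its component devices:

        /* set the bounce limit directly, bypassing the dma_mask
         * heuristics that blk_queue_bounce_limit() applies */
        blk_queue_bounce_pfn(q, bounce_pfn);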
     

23 May, 2009

4 commits

  • To support devices with physical block sizes bigger than 512 bytes we
    need to ensure proper alignment. This patch adds support for exposing
    I/O topology characteristics as devices are stacked.

    logical_block_size is the smallest unit the device can address.

    physical_block_size indicates the smallest I/O the device can write
    without incurring a read-modify-write penalty.

    The io_min parameter is the smallest preferred I/O size reported by
    the device. In many cases this is the same as the physical block
    size. However, the io_min parameter can be scaled up when stacking
    (RAID5 chunk size > physical block size).

    The io_opt characteristic indicates the optimal I/O size reported by
    the device. This is usually the stripe width for arrays.

    The alignment_offset parameter indicates the number of bytes the start
    of the device/partition is offset from the device's natural alignment.
    Partition tools and MD/DM utilities can use this to pad their offsets
    so filesystems start on proper boundaries.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
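
    As a sketch, a driver for a hypothetical drive with 4KB physical sectors
    addressed in 512-byte logical blocks might advertise its topology like
    this:

        blk_queue_logical_block_size(q, 512);    /* smallest addressable unit */
        blk_queue_physical_block_size(q, 4096);  /* no RMW at or above this   */
        blk_queue_alignment_offset(q, 0);        /* naturally aligned         */
        blk_queue_io_min(q, 4096);               /* smallest preferred I/O    */
        blk_queue_io_opt(q, 65536);              /* e.g. a stripe width       */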
     
  • To accommodate stacking drivers that do not have an associated request
    queue we're moving the limits to a separate, embedded structure.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Convert all external users of queue limits to using wrapper functions
    instead of poking the request queue variables directly.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
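
    The conversion is mechanical; for example (illustrative, not a hunk from
    the patch):

        unsigned int max  = queue_max_sectors(q);        /* not q->max_sectors */
        unsigned short bs = queue_logical_block_size(q); /* not the raw field  */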
     
  • Until now we have had a 1:1 mapping between storage device physical
    block size and the logical block size used when addressing the device.
    With SATA 4KB drives coming out that will no longer be the case. The
    sector size will be 4KB but the logical block size will remain
    512 bytes. Hence we need to distinguish between the physical block size
    and the logical one.

    This patch renames hardsect_size to logical_block_size.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

22 Apr, 2009

1 commit

  • Impact: don't set GFP_DMA in q->bounce_gfp unnecessarily

    All DMA address limits are expressed in terms of the last addressable
    unit (byte or page) instead of one plus that. However, when
    determining bounce_gfp for 64bit machines in blk_queue_bounce_limit(),
    it compares the specified limit against 0x100000000UL to determine
    whether it's below 4G, ending up falsely setting GFP_DMA in
    q->bounce_gfp.

    As the DMA zone is very small on x86_64, this makes larger SG_IO
    transfers very eager to trigger the OOM killer. Fix it. While at it,
    rename the
    parameter to @dma_mask for clarity and convert comment to proper
    winged style.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
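
    The off-by-one, sketched with 4K pages (PAGE_SHIFT == 12):

        /* a device that can address the full low 4G has dma_mask 0xffffffff */
        u64 dma_mask = 0xffffffffULL;
        unsigned long b_pfn = dma_mask >> PAGE_SHIFT;  /* 0xfffff */

        /* old, buggy test: compares a last-byte mask against a size */
        int below_4g = b_pfn < (0x100000000ULL >> PAGE_SHIFT); /* 1: "below 4G" */

        /* so GFP_DMA got set even though the device needs no bouncing */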
     

07 Apr, 2009

1 commit

  • Fix a typo (this fix was in the original patch but for some reason was
    not merged along with the code changes).

    Signed-off-by: Alan Cox
    Signed-off-by: Jeff Garzik

    Alan Cox
     

29 Dec, 2008

1 commit

  • Zero is invalid for max_phys_segments, max_hw_segments, and
    max_segment_size. It's better to use min_not_zero instead of min.
    min() works, though, because commit 0e435ac makes sure these values
    are set to their non-zero defaults when a queue is initialized
    properly.

    With this patch, blk_queue_stack_limits() does almost the same thing
    that dm's combine_restrictions_low() does, so it should be easy to
    remove dm's combine_restrictions_low().

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
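
    The stacking logic then looks roughly like this, with zero treated as
    "no limit set" rather than as the smallest value:

        /* min() alone would let a zeroed field win over a real limit */
        t->max_segment_size = min_not_zero(t->max_segment_size,
                                           b->max_segment_size);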
     

03 Dec, 2008

1 commit

  • Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
    devices.

    When stacking devices (LVM over MD over SCSI), some of the request queue
    parameters are not always set up correctly by default, namely
    max_segment_size and the seg_boundary mask.

    If you create an MD device over SCSI, these attributes are zeroed.

    The problem arises when another device-mapper mapping is created on top
    of this stack; the queue attributes are then set in DM this way:

    request_queue    max_segment_size    seg_boundary_mask
    SCSI             65536               0xffffffff
    MD RAID1         0                   0
    LVM              65536               -1 (64bit)

    Unfortunately bio_add_page() (resp. bio_phys_segments()) calculates the
    number of physical segments according to these parameters.

    During generic_make_request() the segment count is recalculated (after
    bio_clone() in the stacking operation) and can increase
    bio->bi_phys_segments beyond the allowed limit.

    This is a particular problem in the CCISS driver, where it produces an
    oops here:

    BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);

    (MAXSGENTRIES is 31 by default.)

    Sometimes even this command is enough to cause an oops:

    dd iflag=direct if=/dev// of=/dev/null bs=128000 count=10

    This command generates bios with 250 sectors, allocated in 32 4k-pages
    (last page uses only 1024 bytes).

    For the LVM layer, it allocates a bio with 31 segments (still OK for
    CCISS); unfortunately at the lower layer it is recalculated to 32
    segments, which violates the CCISS restriction and triggers the BUG_ON().

    The patch tries to fix it by:

    * initializing the attributes above in the queue request constructor,
    blk_queue_make_request()

    * making sure that blk_queue_stack_limits() inherits these settings
    (DM uses its own function to set the limits because
    blk_queue_stack_limits() was introduced later; it should probably
    switch to the generic stack-limits function too)

    * setting the default seg_boundary value in one place (blkdev.h)

    * using this mask as the default in DM (instead of -1, which differs
    on 64bit)

    Bugs related to this:
    https://bugzilla.redhat.com/show_bug.cgi?id=471639
    http://bugzilla.kernel.org/show_bug.cgi?id=8672

    Signed-off-by: Milan Broz
    Reviewed-by: Alasdair G Kergon
    Cc: Neil Brown
    Cc: FUJITA Tomonori
    Cc: Tejun Heo
    Cc: Mike Miller
    Signed-off-by: Jens Axboe

    Milan Broz
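
    With the patch the defaults come from one place; a sketch of what
    blk_queue_make_request() now establishes:

        blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK); /* 0xffffffff */
        blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);      /* 65536 */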
     

17 Oct, 2008

1 commit

  • 'modprobe loop; rmmod loop' effectively creates a blk_queue and destroys
    it, which results in q->unplug_work being canceled without ever having
    been initialized.

    Therefore, move the initialization of q->unplug_work from
    blk_queue_make_request() to blk_alloc_queue*().

    Reported-by: Alexey Dobriyan
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Jens Axboe

    Peter Zijlstra
     

09 Oct, 2008

6 commits

  • This patch adds a new interface, blk_lld_busy(), to check an LLD's busy
    state from the block layer. blk_lld_busy() calls down into the low-level
    driver if the driver has registered q->lld_busy_fn() using
    blk_queue_lld_busy().

    This resolves a performance problem on request-stacking devices,
    described below.

    Some drivers, such as the SCSI mid layer, stop dispatching requests when
    they detect a busy state on the underlying device (host/target/device).
    This lets other requests stay in the I/O scheduler's queue for a chance
    of merging.

    Request-stacking drivers such as request-based dm should follow the same
    logic. However, there has been no generic interface for a stacked device
    to check whether its underlying device(s) are busy. If the
    request-stacking driver dispatches and submits requests to a busy
    underlying device, the requests sit in the underlying device's queue
    with no chance of merging, which causes performance problems under
    bursty I/O load.

    With this patch, the busy state of the underlying device is exported via
    q->lld_busy_fn(), so the request-stacking driver can check it and stop
    dispatching requests while the device is busy.

    The underlying device driver must return the busy state appropriately:
    1: when the device driver cannot process requests immediately.
    0: when the device driver can process requests immediately,
    including abnormal situations where the device driver needs
    to kill all requests.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: Andrew Morton
    Signed-off-by: Jens Axboe

    Kiyoshi Ueda
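
    A sketch of a low-level driver opting in (the mydev structure and its
    fields are hypothetical):

        /* return 1 while the device cannot take requests, 0 otherwise */
        static int mydev_lld_busy(struct request_queue *q)
        {
                struct mydev *dev = q->queuedata;

                return dev->in_flight >= dev->queue_depth;
        }

        /* during queue setup: */
        blk_queue_lld_busy(q, mydev_lld_busy);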
     
  • Right now SCSI and others do their own command timeout handling.
    Move those bits to the block layer.

    Instead of having a timer per command, we try to be a bit more clever
    and simply have one per-queue. This avoids the overhead of having to
    tear down and setup a timer for each command, so it will result in a lot
    less timer fiddling.

    Signed-off-by: Mike Anderson
    Signed-off-by: Jens Axboe

    Jens Axboe
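
    A sketch of a driver opting in to the shared queue timer (the handler
    name and its policy are made up):

        static enum blk_eh_timer_return mydev_timed_out(struct request *rq)
        {
                /* e.g. kick the hardware, then ask for the timer to rearm */
                return BLK_EH_RESET_TIMER;
        }

        /* during queue setup: */
        blk_queue_rq_timed_out(q, mydev_timed_out);
        blk_queue_rq_timeout(q, 30 * HZ);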
     
  • Noticed by sparse:
    block/blk-softirq.c:156:12: warning: symbol 'blk_softirq_init' was not declared. Should it be static?
    block/genhd.c:583:28: warning: function 'bdget_disk' with external linkage has definition
    block/genhd.c:659:17: warning: incorrect type in argument 1 (different base types)
    block/genhd.c:659:17: expected unsigned int [unsigned] [usertype] size
    block/genhd.c:659:17: got restricted gfp_t
    block/genhd.c:659:29: warning: incorrect type in argument 2 (different base types)
    block/genhd.c:659:29: expected restricted gfp_t [usertype] flags
    block/genhd.c:659:29: got unsigned int
    block: kmalloc args reversed

    Signed-off-by: Harvey Harrison
    Signed-off-by: Jens Axboe

    Harvey Harrison
     
  • This patch adds support for controlling the IO completion CPU of
    either all requests on a queue, or on a per-request basis. We export
    a sysfs variable (rq_affinity) which, if set, migrates completions
    of requests to the CPU that originally submitted them. A bio helper
    (bio_set_completion_cpu()) is also added, so that queuers can ask
    for completion on a specific CPU.

    In testing, this has been shown to cut the system time by as much
    as 20-40% on synthetic workloads where CPU affinity is desired.

    This requires a little help from the architecture, so it'll only
    work as designed for archs that are using the new generic smp
    helper infrastructure.

    Signed-off-by: Jens Axboe

    Jens Axboe
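
    Queue-wide behaviour is toggled through the rq_affinity sysfs file;
    per-request, a queuer can ask for a completion CPU with the new helper,
    roughly:

        /* complete this bio on the CPU that is submitting it */
        bio_set_completion_cpu(bio, raw_smp_processor_id());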
     
  • …in them as needed. Fix changed function parameter names. Fix typos/spellos. In comments, change REQ_SPECIAL to REQ_TYPE_SPECIAL and REQ_BLOCK_PC to REQ_TYPE_BLOCK_PC.

    Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

    Randy Dunlap
     
  • Some block devices benefit from a hint that they can forget the contents
    of certain sectors. Add basic support for this to the block core, along
    with a 'blkdev_issue_discard()' helper function which issues such
    requests.

    The caller doesn't get to provide an end_io function, since
    blkdev_issue_discard() will automatically split the request up into
    multiple bios if appropriate. Neither does the function wait for
    completion -- it's expected that callers won't care about when, or even
    _if_, the request completes. It's only a hint to the device anyway. By
    definition, the file system doesn't _care_ about these sectors any more.

    [With feedback from OGAWA Hirofumi and Jens Axboe]
    Signed-off-by: Jens Axboe

    David Woodhouse
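
    Typical fire-and-forget use, sketched (the gfp argument matches early
    versions of the helper; later kernels grew an extra flags argument):

        /* hint that nr_sects starting at sector are no longer needed */
        int err = blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL);

        /* -EOPNOTSUPP just means the device has no discard support;
         * for a pure hint that is safe to ignore */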
     

04 Jul, 2008

1 commit

  • This adds blk_queue_update_dma_pad() to prevent LLDs from wrongly
    overwriting the dma pad mask (we added blk_queue_update_dma_alignment
    for the same reason).

    This also converts libata to use blk_queue_update_dma_pad instead of
    blk_queue_dma_pad.

    Signed-off-by: FUJITA Tomonori
    Cc: Tejun Heo
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Thomas Bogendoerfer
    Cc: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
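
    libata's use, roughly (ATA DMA transfers are padded to 4-byte
    boundaries, hence a mask of ATA_DMA_PAD_SZ - 1):

        /* only ever raise the pad mask, never shrink what is already set */
        blk_queue_update_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1);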
     

15 May, 2008

1 commit

  • As setting and clearing queue flags now requires that we hold a spinlock
    on the queue, and as blk_queue_stack_limits is called without that lock,
    get the lock inside blk_queue_stack_limits.

    For blk_queue_stack_limits to be able to find the right lock, each md
    personality needs to set q->queue_lock to point to the appropriate lock.
    Those personalities which didn't previously use a spin_lock use
    q->__queue_lock, so that lock is now always initialised on allocation.

    With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no
    longer cause warnings as it will be clear that the proper lock is held.

    Thanks to Dan Williams for review and fixing the silly bugs.

    Signed-off-by: NeilBrown
    Cc: Dan Williams
    Cc: Jens Axboe
    Cc: Alistair John Strachan
    Cc: Nick Piggin
    Cc: "Rafael J. Wysocki"
    Cc: Jacek Luczak
    Cc: Prakash Punnoor
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
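
    The md side of the change, sketched for a raid1-like personality
    (conf->device_lock is that personality's existing lock):

        /* personalities with a natural lock point the queue at it ... */
        mddev->queue->queue_lock = &conf->device_lock;

        /* ... the rest fall back to the lock embedded in the queue */
        mddev->queue->queue_lock = &mddev->queue->__queue_lock;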
     
