01 Jul, 2008

1 commit

  • AS scheduler alternates between issuing read and write batches. It does
    the batch switch only after all requests from the previous batch are
    completed.

    When switching to a write batch, if there is an in-flight read request,
    the scheduler waits for its completion and records its intention to
    switch by setting ad->changed_batch and the new direction, but it does
    not update the batch_expire_time for the new write batch (which it does
    do when there are no previous pending requests).
    On completion of the read request, it sees that a switch was pending,
    schedules work for kblockd right away and resets the ad->changed_batch
    flag.
    Now when kblockd enters dispatch_request, where it is expected to pick
    up a write request, it instead ends the write batch immediately, because
    the batch_expire_time was never updated and still holds the expiry
    timestamp of the previous batch.

    This results in write starvation in every case where there is an
    intention to switch to a write batch while a previous read request is
    still in flight: the batch gets reverted to a read batch right away.

    This also holds true in the reverse case (switching from a write batch
    to a read batch with an in-flight write request).

    I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and
    linux-2.6-block git HEAD. I've tested the fix on x86 platforms with
    SCSI drives where the driver asks for the next request while a current
    request is in-flight.

    This patch is based off linux-2.6-block git HEAD.

    Bug reproduction:
    A simple scenario which reproduces this bug is:
    - dd if=/dev/hda3 of=/dev/null &
    - lilo
    The lilo takes forever to complete.

    This can also be reproduced fairly easily with the earlier dd and
    another test program doing msync().

    The example test program below should print out a message after every
    iteration, but it simply hangs forever. With this bugfix it makes
    forward progress.

    ====
    Example test program using msync() (thanks to suleiman AT google DOT
    com)

    #define _GNU_SOURCE /* for O_NOATIME */
    #include <err.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static inline uint64_t
    rdtsc(void)
    {
            int64_t tsc;

            /* 32-bit x86 form: rdtsc returns the counter in edx:eax ("=A") */
            __asm __volatile("rdtsc" : "=A" (tsc));
            return (tsc);
    }

    int
    main(int argc, char **argv)
    {
            struct stat st;
            uint64_t e, s, t;
            char *p;
            long i;
            int fd;

            if (argc < 2) {
                    printf("Usage: %s <file>\n", argv[0]);
                    return (1);
            }

            if ((fd = open(argv[1], O_RDWR | O_NOATIME)) < 0)
                    err(1, "open");

            if (fstat(fd, &st) < 0)
                    err(1, "fstat");

            p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
            if (p == MAP_FAILED)
                    err(1, "mmap");

            t = 0;
            for (i = 0; i < 1000; i++) {
                    *p = 0;
                    msync(p, 4096, MS_SYNC);
                    s = rdtsc();
                    *p = 0;
                    __asm __volatile(""::: "memory");
                    e = rdtsc();
                    if (argc > 2)
                            printf("%ld: %lld cycles %jd %jd\n",
                                i, (long long)(e - s), (intmax_t)s,
                                (intmax_t)e);
                    t += e - s;
            }
            printf("average time: %lld cycles\n", (long long)(t / 1000));
            return (0);
    }

    Cc:
    Acked-by: Nick Piggin
    Signed-off-by: Jens Axboe

    Divyesh Shah
     

13 Jun, 2008

1 commit


10 Jun, 2008

1 commit

  • Commit 30f2f0eb4bd2c43d10a8b0d872c6e5ad8f31c9a0 ("block: do_mounts -
    accept root=") extended blk_lookup_devt() to be
    able to look up partitions that had not yet been registered, but in the
    process made the assumption that the '&block_class.devices' list only
    contains disk devices and that you can do 'dev_to_disk(dev)' on them.

    That isn't actually true. The block_class device list also contains the
    partitions we've discovered so far, and you can't just do a
    'dev_to_disk()' on those.

    So make sure to only work on devices that block/genhd.c has registered
    itself, something we can test by checking the 'dev->type' member. This
    makes the loop in blk_lookup_devt() match the other such loops in this
    file.

    [ We may want to do an alternate version that knows to handle _either_
    whole-disk devices or partitions, but for now this is the minimal fix
    for a series of crashes reported by Mariusz Kozlowski in

    http://lkml.org/lkml/2008/5/25/25

    and Ingo in

    http://lkml.org/lkml/2008/6/9/39 ]

    Reported-by: Mariusz Kozlowski
    Reported-by: Ingo Molnar
    Cc: Neil Brown
    Cc: Joao Luis Meloni Assirati
    Acked-by: Kay Sievers
    Cc: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 May, 2008

6 commits


15 May, 2008

2 commits

  • As setting and clearing queue flags now requires that we hold a spinlock
    on the queue, and as blk_queue_stack_limits is called without that lock,
    get the lock inside blk_queue_stack_limits.

    For blk_queue_stack_limits to be able to find the right lock, each md
    personality needs to set q->queue_lock to point to the appropriate lock.
    Those personalities which didn't previously use a spinlock now use
    q->__queue_lock, so that lock is always initialised when the queue is
    allocated.

    With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no
    longer cause warnings as it will be clear that the proper lock is held.

    Thanks to Dan Williams for review and fixing the silly bugs.

    Signed-off-by: NeilBrown
    Cc: Dan Williams
    Cc: Jens Axboe
    Cc: Alistair John Strachan
    Cc: Nick Piggin
    Cc: "Rafael J. Wysocki"
    Cc: Jacek Luczak
    Cc: Prakash Punnoor
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
     
  • Some devices, like md, may create partitions only at first access,
    so allow root= to be set to a valid non-existent partition of an
    existing disk. This applies only to non-initramfs root mounting.

    This fixes a regression from 2.6.24 which did allow this to happen and
    broke some users machines :(

    Acked-by: Neil Brown
    Tested-by: Joao Luis Meloni Assirati
    Cc: stable
    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

13 May, 2008

1 commit

    bdevname() fills the buffer that it is given as a parameter, so calling
    strcpy() or snprintf() on the returned value is redundant (and probably
    not guaranteed to work, since strcpy and snprintf are not required to
    support overlapping buffers).

    Signed-off-by: Jean Delvare
    Cc: Stephen Tweedie
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jean Delvare
     

07 May, 2008

7 commits

  • get_part() is fairly expensive, as it O(N) loops over partitions
    to find the right one. In lots of normal IO paths we end up looking
    up the partition twice, to make matters even worse. Change the
    stat add code to accept a passed in partition instead.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • We currently set all processes to the best-effort scheduling class,
    regardless of what CPU scheduling class they belong to. Improve that
    so that we correctly track idle and rt scheduling classes as well.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Original patch from Mikulas Patocka

    Mike Anderson was doing an OLTP benchmark on a computer with 48 physical
    disks mapped to one logical device via device mapper.

    He found that there was a slowdown on request_queue->lock in function
    generic_unplug_device. The slowdown is caused by the fact that when some
    code calls unplug on the device mapper, device mapper calls unplug on all
    physical disks. These unplug calls take the lock, find that the queue is
    already unplugged, release the lock and exit.

    With the below patch, performance of the benchmark was increased by 18%
    (the whole OLTP application, not just block layer microbenchmarks).

    So I'm submitting this patch for upstream. I think the patch is correct:
    when multiple threads call plug and unplug simultaneously, it is
    unspecified whether the queue ends up plugged or unplugged (so the
    patch can't make this worse), and the caller that plugged the queue
    should unplug it anyway (if it doesn't, there's a 3ms timeout).

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • They tend to depend a lot on the workload, so not a clear-cut
    likely or unlikely fit.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • put_io_context() drops the RCU read lock before calling into cfq_dtor(),
    however we need to hold off freeing there before grabbing and
    dereferencing the first object on the list.

    So extend the rcu_read_lock() scope to cover the calling of cfq_dtor(),
    and optimize cfq_free_io_context() to use a new variant for
    call_for_each_cic() that assumes the RCU read lock is already held.

    Hit in the wild by Alexey Dobriyan

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • For most initialization purposes, calling blk_queue_init_tags() without
    the queue lock held is OK. Only if called for resizing an existing map
    must the lock be held. Ditto for tag cleanup, the maps are reference
    counted.

    So switch the general queue flag setting to the unlocked variant, but
    retain the locked variant for resizing.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Concurrency isn't a big deal here since we have requests in flight
    at this point, but do the locked variant to set a better example.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

03 May, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
    [SCSI] aic94xx: fix section mismatch
    [SCSI] u14-34f: Fix 32bit only problem
    [SCSI] dpt_i2o: sysfs code
    [SCSI] dpt_i2o: 64 bit support
    [SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent
    [SCSI] dpt_i2o: use standard __init / __exit code
    [SCSI] megaraid_sas: fix suspend/resume sections
    [SCSI] aacraid: Add Power Management support
    [SCSI] aacraid: Fix jbod operations scan issues
    [SCSI] aacraid: Fix warning about macro side-effects
    [SCSI] add support for variable length extended commands
    [SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
    [SCSI] bsg: add large command support
    [SCSI] aacraid: Fix down_interruptible() to check the return value correctly
    [SCSI] megaraid_sas; Update the Version and Changelog
    [SCSI] ibmvscsi: Handle non SCSI error status
    [SCSI] bug fix for free list handling
    [SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions
    [SCSI] megaraid_mbox: fix Dell CERC firmware problem

    Linus Torvalds
     
    Add support for variable-length, extended, and vendor-specific
    CDBs to scsi-ml. It is now possible for initiators and ULDs
    to issue these types of commands. LLDs need not change much:
    all they need to do is raise .max_cmd_len to the longest command
    they support (see the iscsi patch).

    - clean-up some code paths that did not expect commands to be
    larger than 16, and change cmd_len members' type to short as
    char is not enough.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Benny Halevy
    Signed-off-by: James Bottomley

    Boaz Harrosh
     

02 May, 2008

1 commit


01 May, 2008

1 commit


30 Apr, 2008

1 commit

  • Provide a place in sysfs (/sys/class/bdi) for the backing_dev_info object.
    This allows us to see and set the various BDI specific variables.

    In particular this properly exposes the read-ahead window for all relevant
    users and /sys/block//queue/read_ahead_kb should be deprecated.

    With patient help from Kay Sievers and Greg KH

    [mszeredi@suse.cz]

    - split off NFS and FUSE changes into separate patches
    - document new sysfs attributes under Documentation/ABI
    - do bdi_class_init as a core_initcall, otherwise the "default" BDI
    won't be initialized
    - remove bdi_init_fmt macro, it's not used very much

    [akpm@linux-foundation.org: fix ia64 warning]
    Signed-off-by: Peter Zijlstra
    Cc: Kay Sievers
    Acked-by: Greg KH
    Cc: Trond Myklebust
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

29 Apr, 2008

11 commits

  • The block I/O + elevator + I/O scheduler code spend a lot of time trying
    to merge I/Os -- rightfully so under "normal" circumstances. However,
    if one were to know that the incoming I/O stream was /very/ random in
    nature, the cycles are wasted.

    This patch adds a per-request_queue tunable that (when set) disables
    merge attempts (beyond the simple one-hit cache check), thus freeing up
    a non-trivial amount of CPU cycles.

    Signed-off-by: Alan D. Brunelle
    Signed-off-by: Jens Axboe

    Alan D. Brunelle
     
  • This patch changes rq->cmd from the static array to a pointer to
    support large commands.

    We rarely handle large commands, so as an optimization a struct
    request still has a static array for the command; rq_init sets the
    rq->cmd pointer to that static array.

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • This is a preparation for changing rq->cmd from the static array to a
    pointer.

    Signed-off-by: FUJITA Tomonori
    Cc: Boaz Harrosh
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
    This renames rq_init() to blk_rq_init() and exports it. Any path that
    hands a request to the block layer needs to call it to initialize the
    request.

    This is a preparation for large command support, which needs to
    initialize the request in a proper way (that is, just doing a memset()
    will not work).

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
    blk_get_request initializes rq->cmd (via rq_init), so users don't need
    to do that themselves.

    The purpose of this patch is to remove sizeof(rq->cmd) and &rq->cmd,
    as a preparation for large command support, which changes rq->cmd from
    the static array to a pointer. sizeof(rq->cmd) will not make sense and
    &rq->cmd won't work.

    Signed-off-by: FUJITA Tomonori
    Cc: James Bottomley
    Cc: Alasdair G Kergon
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • This patch fixes the following build error with UML and gcc 4.3:

    ...
    CC block/blk-barrier.o
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c: In function ‘blk_do_ordered’:
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:57: sorry, unimplemented: inlining failed in call to ‘blk_ordered_cur_seq’: function body not available
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:252: sorry, unimplemented: called from here
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:57: sorry, unimplemented: inlining failed in call to ‘blk_ordered_cur_seq’: function body not available
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/blk-barrier.c:253: sorry, unimplemented: called from here
    make[2]: *** [block/blk-barrier.o] Error 1

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • This patch fixes the following build error with UML and gcc 4.3:

    ...
    CC block/elevator.o
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c: In function ‘elv_merge’:
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:73: sorry, unimplemented: inlining failed in call to ‘elv_rq_merge_ok’: function body not available
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:103: sorry, unimplemented: called from here
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:73: sorry, unimplemented: inlining failed in call to ‘elv_rq_merge_ok’: function body not available
    /home/bunk/linux/kernel-2.6/git/linux-2.6/block/elevator.c:495: sorry, unimplemented: called from here
    make[2]: *** [block/elevator.o] Error 1
    make[1]: *** [block] Error 2

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • We can save some atomic ops in the IO path, if we clearly define
    the rules of how to modify the queue flags.

    Signed-off-by: Jens Axboe

    Nick Piggin
     
  • This patch adds bio_copy_kern similar to
    bio_copy_user. blk_rq_map_kern uses bio_copy_kern instead of
    bio_map_kern if necessary.

    bio_copy_kern uses temporary pages, and the bi_end_io callback frees
    these pages. bio_copy_kern saves the original kernel buffer at
    bio->bi_private; it doesn't use something like struct bio_map_data to
    store the information about the caller.

    Signed-off-by: FUJITA Tomonori
    Cc: Tejun Heo
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • blk_max_pfn can now be unexported.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • This requires moving rq_init() from get_request() to blk_alloc_request().
    The upside is that we can now require an rq_init() from any path that
    wishes to hand the request to the block layer.

    rq_init() will be exported for the code that uses struct request
    without blk_get_request.

    This is a preparation for large command support, which needs to
    initialize struct request in a proper way (that is, just doing a
    memset() will not work).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

23 Apr, 2008

1 commit

  • This patch adds release callback support, which is called when a bsg
    device goes away. bsg_register_queue() takes a pointer to a callback
    function. This feature is useful for stuff like sas_host that can't
    use the release callback in struct device.

    If a caller doesn't need bsg's release callback, it can call
    bsg_register_queue() with NULL pointer (e.g. scsi devices can use
    release callback in struct device so they don't need bsg's callback).

    With this patch, bsg uses kref for refcounts on bsg devices instead of
    get/put_device in fops->open/release. bsg calls put_device and the
    caller's release callback (if it was registered) in kref_put's
    release.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: James Bottomley

    FUJITA Tomonori
     

22 Apr, 2008

1 commit

  • * 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block:
    block: fix blk_register_queue() return value
    block: fix memory hotplug and bouncing in block layer
    block: replace remaining __FUNCTION__ occurrences
    Kconfig: clean up block/Kconfig help descriptions
    cciss: fix warning oops on rmmod of driver
    cciss: Fix race between disk-adding code and interrupt handler
    block: move the padding adjustment to blk_rq_map_sg
    block: add bio_copy_user_iov support to blk_rq_map_user_iov
    block: convert bio_copy_user to bio_copy_user_iov
    loop: manage partitions in disk image
    cdrom: use kmalloced buffers instead of buffers on stack
    cdrom: make unregister_cdrom() return void
    cdrom: use list_head for cdrom_device_info list
    cdrom: protect cdrom_device_info list by mutex
    cdrom: cleanup hardcoded error-code
    cdrom: remove ifdef CONFIG_SYSCTL

    Linus Torvalds
     

21 Apr, 2008

3 commits

    blk_register_queue() returns -ENXIO when queue->request_fn is NULL, but
    some block drivers (for example, brd and loop) call blk_register_queue()
    via add_disk() with queue->request_fn == NULL.

    Although no one checks the return value of blk_register_queue(), this
    patch makes it return 0 instead of -ENXIO when queue->request_fn is
    NULL.

    This patch also adds a warning when blk_register_queue() or
    blk_unregister_queue() is called with queue == NULL, rather than
    ignoring the invalid usage silently.

    Signed-off-by: Akinobu Mita
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Akinobu Mita
     
  • Modify the help descriptions of block/Kconfig for clarity, accuracy and
    consistency.

    Refactor the BLOCK description a bit. The wording "This permits ... to be
    removed" isn't quite right; the block layer is removed when the option is
    disabled, whereas most descriptions talk about what happens when the option is
    enabled. Reformat the list of what is affected by disabling the block layer.

    Add more examples of large block devices to LBD and strive for technical
    accuracy: block devices of size _exactly_ 2TB require CONFIG_LBD, not
    only those "bigger than 2TB". Also clarify that the config option is
    only needed for individual block devices of size >= 2TB; for example,
    with 3 x 1TB disks you'd have 3TB of total storage, but you wouldn't
    need the option unless you aggregate those disks into a RAID or LVM.

    Improve terminology and grammar on BLK_DEV_IO_TRACE.

    I also added the boilerplate "If unsure, say N" to most options.

    Precisely say "2TB and larger" for LSF.

    Indent the help text for BLK_DEV_BSG by 2 spaces in accordance with the
    standard.

    Signed-off-by: Nick Andrew
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Nick Andrew
     
  • blk_rq_map_user adjusts bi_size of the last bio. It breaks the rule
    that req->data_len (the true data length) is equal to sum(bio). It
    broke the scsi command completion code.

    commit e97a294ef6938512b655b1abf17656cf2b26f709 was introduced to fix
    the above issue. However, the partial completion code doesn't work
    with it. The commit is also a layer violation (scsi mid-layer should
    not know about the block layer's padding).

    This patch moves the padding adjustment to blk_rq_map_sg (suggested by
    James). The padding works like the drain buffer. This patch breaks the
    rule that req->data_len is equal to sum(sg); however, the drain buffer
    already broke it. So this patch just restores the rule that
    req->data_len is equal to sum(bio) without breaking anything new.

    Now when a low level driver needs padding, blk_rq_map_user and
    blk_rq_map_user_iov guarantee there's enough room for padding.
    blk_rq_map_sg can safely extend the last entry of a scatter list.

    blk_rq_map_sg must extend the last entry of a scatter list only for a
    request that went through bio_copy_user_iov. This patch introduces a
    new REQ_COPY_USER flag for that purpose.

    Signed-off-by: FUJITA Tomonori
    Cc: Tejun Heo
    Cc: Mike Christie
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    FUJITA Tomonori