19 Feb, 2008

10 commits

  • Clear drain buffer before chaining if the command in question is a
    write.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Draining shouldn't be done for commands where overflow may indicate
    data integrity issues. Add dma_drain_needed callback to
    request_queue. Drain buffer is appended iff this function returns
    non-zero.

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
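
    The rule in the commit above can be sketched in plain C. This is a
    userspace model, not the kernel code: the structs are cut-down
    stand-ins, and the is_integrity_sensitive flag, example_drain_needed()
    and mapped_len() are hypothetical names chosen for illustration. Only
    the dma_drain_needed callback name comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel's request and request_queue. */
struct request {
    int is_integrity_sensitive;  /* hypothetical: overflow would mean a data error */
    size_t data_len;             /* bytes described by the request */
};

struct request_queue {
    size_t dma_drain_size;                        /* 0 means no drain buffer */
    int (*dma_drain_needed)(struct request *rq);  /* callback named in the commit */
};

/* Example policy (an assumption): never drain integrity-sensitive commands. */
static int example_drain_needed(struct request *rq)
{
    return !rq->is_integrity_sensitive;
}

/* Length to map: the drain buffer is appended iff the callback returns non-zero. */
static size_t mapped_len(const struct request_queue *q, struct request *rq)
{
    assert(q != NULL && rq != NULL);
    if (q->dma_drain_size && q->dma_drain_needed && q->dma_drain_needed(rq))
        return rq->data_len + q->dma_drain_size;
    return rq->data_len;
}
```

    A driver would install its own predicate in place of
    example_drain_needed(); a queue without a callback never drains in
    this sketch.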
     
  • With padding and draining moved into it, block layer now may extend
    requests as directed by queue parameters, so now a request has two
    sizes - the original request size and the extended size which matches
    the size of area pointed to by bios and later by sgs. The latter size
    is what lower layers are primarily interested in when allocating,
    filling up DMA tables and setting up the controller.

    Both padding and draining extend the data area to accommodate
    controller characteristics. As any controller which speaks SCSI can
    handle underflows, feeding larger data area is safe.

    So, this patch makes the primary data length field, request->data_len,
    indicate the size of the full data area and adds a separate length field,
    request->raw_data_len, for the unmodified request size. The latter is
    used to report to higher layer (userland) and where the original
    request size should be fed to the controller or device.

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
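
    A minimal sketch of the two-length bookkeeping described above,
    assuming a simplified struct request. The field names data_len and
    raw_data_len are from the commit message; rq_init_len() and
    rq_extend() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    size_t data_len;      /* full data area, including padding and drain */
    size_t raw_data_len;  /* unmodified request size, reported upward */
};

/* A fresh request starts with both lengths equal. */
static void rq_init_len(struct request *rq, size_t len)
{
    assert(rq != NULL);
    rq->data_len = len;
    rq->raw_data_len = len;
}

/* Padding or draining grows only the full data area. */
static void rq_extend(struct request *rq, size_t extra)
{
    assert(rq != NULL);
    rq->data_len += extra;
}
```

    Lower layers sizing DMA tables would read data_len, while code
    reporting the transfer size to userland would read raw_data_len.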
     
  • DMA start address and transfer size alignment for PC requests are
    achieved using bio_copy_user() instead of bio_map_user(). This works
    because bio_copy_user() always uses full pages and block DMA alignment
    isn't allowed to go over PAGE_SIZE.

    However, the implementation didn't update the last bio of the request
    to make this padding visible to lower layers. This patch makes
    blk_rq_map_user() extend the last bio such that it includes the
    padding area and the size of area pointed to by the request is
    properly aligned.

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
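
    The padding arithmetic can be sketched as follows, assuming the
    alignment is expressed as a mask (alignment minus one, with the
    alignment a power of two). struct bio_tail, pad_len() and
    extend_last_bio() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the last bio of a request. */
struct bio_tail {
    size_t len;
};

/* Bytes needed to round len up to the next multiple of (align_mask + 1). */
static size_t pad_len(size_t len, size_t align_mask)
{
    size_t aligned;

    assert(((align_mask + 1) & align_mask) == 0);  /* power-of-two alignment */
    aligned = (len + align_mask) & ~align_mask;
    return aligned - len;
}

/* Grow the last bio so the whole request ends on an aligned boundary. */
static void extend_last_bio(struct bio_tail *last, size_t total_len,
                            size_t align_mask)
{
    assert(last != NULL);
    last->len += pad_len(total_len, align_mask);
}
```

    With a 512-byte alignment (mask 511), a 510-byte request gains two
    bytes of padding and an already aligned request gains none.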
     
  • Currently we fail if someone requests a valid io scheduler, but it's
    modular and not currently loaded. That can happen from a driver init
    asking for a different scheduler, or online switching through sysfs
    as requested by a user.

    This patch makes elevator_get() call request_module() to attempt to
    load the appropriate module, instead of requiring that to be done
    manually.

    Signed-off-by: Jens Axboe

    Jens Axboe
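
    The lookup, load, retry pattern can be modelled in userspace like
    this. Everything here is a stand-in: fake_request_module() only
    pretends to load a module, the "<name>-iosched" module naming is an
    assumption of the sketch, and elv_find(), elv_register() and
    elv_get() are hypothetical helpers rather than the kernel functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ELV 8

static const char *registered[MAX_ELV];  /* schedulers currently "loaded" */
static int nr_registered;
static int nr_load_attempts;             /* counts simulated module loads */

static const char *elv_find(const char *name)
{
    for (int i = 0; i < nr_registered; i++)
        if (strcmp(registered[i], name) == 0)
            return registered[i];
    return NULL;
}

static void elv_register(const char *name)
{
    assert(nr_registered < MAX_ELV);
    registered[nr_registered++] = name;
}

/* Stand-in for request_module(): only "cfq-iosched" exists in this model. */
static void fake_request_module(const char *modname)
{
    nr_load_attempts++;
    if (strcmp(modname, "cfq-iosched") == 0)
        elv_register("cfq");
}

/* Look the scheduler up; on a miss, try to load it and look again. */
static const char *elv_get(const char *name)
{
    const char *e = elv_find(name);

    if (!e) {
        char modname[64];

        snprintf(modname, sizeof(modname), "%s-iosched", name);
        fake_request_module(modname);
        e = elv_find(name);
    }
    return e;
}
```

    A scheduler that is already registered is returned without a load
    attempt, and an unknown name still fails after one attempt.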
     
  • It's cumbersome to browse a radix tree from start to finish, especially
    since we modify keys when a process exits. So add a hlist for the single
    purpose of browsing over all known cfq_io_contexts, used for exit,
    io prio change, etc.

    This fixes http://bugzilla.kernel.org/show_bug.cgi?id=9948

    Signed-off-by: Jens Axboe

    Jens Axboe
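
    The idea of keeping a plain list purely for iteration can be
    sketched as below. The keyed lookup structure is left out, and the
    struct layouts, the ioprio_changed flag and both helper functions
    are hypothetical simplifications rather than the cfq code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified per-queue io context. */
struct cic {
    const void *key;     /* lookup key used by the (omitted) index */
    int ioprio_changed;  /* hypothetical flag set on io prio change */
    struct cic *next;    /* link used only for walking all contexts */
};

struct ioc {
    struct cic *cic_list;  /* head of the "browse everything" list */
};

/* Every context is threaded onto the list when it is created. */
static void ioc_add_cic(struct ioc *ioc, struct cic *cic)
{
    assert(ioc != NULL && cic != NULL);
    cic->next = ioc->cic_list;
    ioc->cic_list = cic;
}

/* Walk all contexts without touching the keyed index; returns the count. */
static int ioc_mark_ioprio_changed(struct ioc *ioc)
{
    int n = 0;

    assert(ioc != NULL);
    for (struct cic *c = ioc->cic_list; c; c = c->next) {
        c->ioprio_changed = 1;
        n++;
    }
    return n;
}
```

    Exit handling would walk the same list, which avoids iterating a
    tree whose keys may change underneath the walker.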
     
  • That way the interface is symmetric, and calling blk_rq_unmap_user()
    on the request won't oops.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • blk_settings_init() can become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • blk_ioc_init() can become static.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • request_cachep needlessly became global.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     

08 Feb, 2008

4 commits


01 Feb, 2008

6 commits


31 Jan, 2008

1 commit


30 Jan, 2008

9 commits


29 Jan, 2008

2 commits

  • * 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block:
    block: implement drain buffers
    __bio_clone: don't calculate hw/phys segment counts
    block: allow queue dma_alignment of zero
    blktrace: Add blktrace ioctls to SCSI generic devices

    Linus Torvalds
     
  • * 'blk-end-request' of git://git.kernel.dk/linux-2.6-block: (30 commits)
    blk_end_request: changing xsysace (take 4)
    blk_end_request: changing ub (take 4)
    blk_end_request: cleanup of request completion (take 4)
    blk_end_request: cleanup 'uptodate' related code (take 4)
    blk_end_request: remove/unexport end_that_request_* (take 4)
    blk_end_request: changing scsi (take 4)
    blk_end_request: add bidi completion interface (take 4)
    blk_end_request: changing ide-cd (take 4)
    blk_end_request: add callback feature (take 4)
    blk_end_request: changing ide normal caller (take 4)
    blk_end_request: changing cpqarray (take 4)
    blk_end_request: changing cciss (take 4)
    blk_end_request: changing ide-scsi (take 4)
    blk_end_request: changing s390 (take 4)
    blk_end_request: changing mmc (take 4)
    blk_end_request: changing i2o_block (take 4)
    blk_end_request: changing viocd (take 4)
    blk_end_request: changing xen-blkfront (take 4)
    blk_end_request: changing viodasd (take 4)
    blk_end_request: changing sx8 (take 4)
    ...

    Linus Torvalds
     

28 Jan, 2008

8 commits

  • Use of inlines was a bit over the top; trim them down a bit.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Currently you must be root to set idle io prio class on a process. This
    is due to the fact that the idle class is implemented as a true idle
    class, meaning that it will not make progress if someone else is
    requesting disk access. Unfortunately this means that it opens DoS
    opportunities by locking down file system resources, hence it is root
    only at the moment.

    This patch relaxes the idle class a little, by removing the truly idle
    part (which entails a grace period with associated timer). The
    modifications make the idle class as close to zero impact as can be done
    while still guaranteeing progress. This means we can relax the root-only
    criterion as well.

    Signed-off-by: Jens Axboe

    Jens Axboe
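
    From userland, requesting the idle class looks roughly like the
    sketch below. It assumes a Linux system where SYS_ioprio_set is
    available; the IOPRIO_* constants are defined locally because libc
    headers do not always provide them, and ioprio_value() and
    set_self_idle() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_CLASS_IDLE  3
#define IOPRIO_WHO_PROCESS 1

/* Pack an io priority class and per-class data into one value. */
static int ioprio_value(int ioclass, int data)
{
    assert(ioclass >= 0 && ioclass <= 3);
    return (ioclass << IOPRIO_CLASS_SHIFT) | data;
}

/* Ask the kernel to put the calling process in the idle io class. */
static long set_self_idle(void)
{
#ifdef SYS_ioprio_set
    return syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                   ioprio_value(IOPRIO_CLASS_IDLE, 0));
#else
    return -1;  /* syscall not available on this platform */
#endif
}
```

    Before this change such a call was restricted to root; whether it
    succeeds for an unprivileged caller depends on the running kernel.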
     
  • These DMA drain buffer implementations in drivers are pretty horrible
    to do in terms of manipulating the scatterlist. Plus they're being
    done at least in drivers/ide and drivers/ata, so we now have code
    duplication.

    The one use case for this, as I understand it, is AHCI controllers doing
    PIO mode to mmc devices but translating this to DMA at the controller
    level.

    So, what about adding a callback to the block layer that permits the
    adding of the drain buffer for the problem devices? The idea is that
    you'd do this in slave_configure after you find one of these devices.

    The beauty of doing it in the block layer is that it quietly adds the
    drain buffer to the end of the sg list, so it automatically gets mapped
    (and unmapped) without anything unusual having to be done to the
    scatterlist in drivers/scsi or drivers/ata and without any alteration to
    the transfer length.

    Signed-off-by: James Bottomley
    Signed-off-by: Jens Axboe

    James Bottomley
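
    Appending the drain buffer as one extra scatterlist entry can be
    sketched like this. It is a userspace model with hypothetical
    struct sg_entry and struct queue types and a made-up
    map_sg_with_drain() helper; the real scatterlist API is not used.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified scatter/gather entry. */
struct sg_entry {
    void *addr;
    size_t len;
};

/* Per-queue drain settings, configured once for a problem device. */
struct queue {
    void *dma_drain_buffer;
    size_t dma_drain_size;  /* 0 means the queue has no drain buffer */
};

/*
 * Append the drain buffer after the last data entry and return the new
 * entry count. Existing entries, and so the transfer length they
 * describe, are left untouched.
 */
static int map_sg_with_drain(const struct queue *q, struct sg_entry *sg,
                             int nents, int max_ents)
{
    assert(q != NULL && sg != NULL);
    assert(nents >= 0 && nents <= max_ents);
    if (q->dma_drain_size && nents < max_ents) {
        sg[nents].addr = q->dma_drain_buffer;
        sg[nents].len = q->dma_drain_size;
        nents++;
    }
    return nents;
}
```

    Because the extra entry is part of the list, whatever maps and
    unmaps the list handles the drain buffer along with the data.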
     
  • changes to anticipatory io scheduler for io_context sharing

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • The io context sharing introduced a per-ioc spinlock to protect the
    cfq io context lookup. That is a regression from the original, since
    we never needed any locking there because the ioc/cic were process private.

    The cic lookup is changed from an rbtree construct to a radix tree, which
    lets us use RCU to make the reader side lockless. That is the performance
    critical path; modifying the radix tree is only done on process creation
    (when that process first does IO, actually) and on process exit (if that
    process has done IO).

    As it so happens, radix trees are also much faster for this type of
    lookup where the key is a pointer. It's a very sparse tree.

    Signed-off-by: Jens Axboe

    Jens Axboe
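
    To illustrate a lookup keyed by a pointer value, here is a minimal
    fixed-depth radix tree in userspace C. It is not the kernel's radix
    tree and has no RCU or locking at all; RADIX_BITS, the node layout
    and both functions are assumptions of this sketch, and interior
    nodes are never freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RADIX_BITS   4
#define RADIX_SLOTS  (1u << RADIX_BITS)
#define RADIX_LEVELS ((int)(sizeof(uintptr_t) * 8 / RADIX_BITS))

struct radix_node {
    void *slots[RADIX_SLOTS];  /* child nodes, or items at the last level */
};

/* Pick the slot for this level, consuming the key from the top bits down. */
static unsigned slot_index(uintptr_t key, int level)
{
    int shift = (RADIX_LEVELS - 1 - level) * RADIX_BITS;

    return (unsigned)((key >> shift) & (RADIX_SLOTS - 1));
}

/* Store item under the pointer key; returns 0 on success, -1 on OOM. */
static int radix_insert(struct radix_node *root, const void *key_ptr,
                        void *item)
{
    uintptr_t key = (uintptr_t)key_ptr;
    struct radix_node *node = root;

    assert(root != NULL);
    for (int level = 0; level < RADIX_LEVELS - 1; level++) {
        unsigned i = slot_index(key, level);

        if (!node->slots[i]) {
            node->slots[i] = calloc(1, sizeof(struct radix_node));
            if (!node->slots[i])
                return -1;
        }
        node = node->slots[i];
    }
    node->slots[slot_index(key, RADIX_LEVELS - 1)] = item;
    return 0;
}

/* Read-only walk: no allocation and no writes on the lookup path. */
static void *radix_lookup(const struct radix_node *root, const void *key_ptr)
{
    uintptr_t key = (uintptr_t)key_ptr;
    const struct radix_node *node = root;

    assert(root != NULL);
    for (int level = 0; level < RADIX_LEVELS - 1; level++) {
        node = node->slots[slot_index(key, level)];
        if (!node)
            return NULL;
    }
    return node->slots[slot_index(key, RADIX_LEVELS - 1)];
}
```

    The lookup only follows pointers and never writes, which is the
    property that makes a lockless reader side plausible once updates
    are published safely.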
     
  • changes in the cfq for io_context sharing

    Signed-off-by: Jens Axboe

    Nikanth Karthikesan
     
  • Detach task state from ioc, instead keep track of how many processes
    are accessing the ioc.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This is where it belongs and then it doesn't take up space for a
    process that doesn't do IO.

    Signed-off-by: Jens Axboe

    Jens Axboe