27 Aug, 2008

1 commit

  • cmd_filter works only for the block layer SG_IO with SCSI block
    devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
    character devices (such as st). We hit a kernel crash with them.

    The problem is that the cmd_filter code accesses gendisk (which holds
    struct blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. This works only
    for SCSI block device files. With character device files, inode->i_bdev
    leads you to struct cdev, so inode->i_bdev->bd_disk->blk_scsi_cmd_filter
    isn't safe.

    SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
    independent of any protocol. We shouldn't change ULDs to expose their
    gendisk.

    This patch moves struct blk_scsi_cmd_filter from gendisk to
    request_queue, a common object that everyone can access.

    The user interface doesn't change; users can change the filters via
    /sys/block/. gendisk has a pointer to request_queue, so the cmd_filter
    code can still reach struct blk_scsi_cmd_filter through it.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
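
    The ownership change described in this commit can be pictured with a
    small user-space model. This is only a sketch: every model_ type and
    helper below is a hypothetical stand-in rather than the kernel's own
    definition, and the bitmap size is an assumption picked for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
#define MODEL_FILTER_LONGS 4 /* assumption: size picked for the sketch */

struct model_cmd_filter {
    unsigned long read_ok[MODEL_FILTER_LONGS];
    unsigned long write_ok[MODEL_FILTER_LONGS];
};

/* After the patch, the filter lives in the queue ... */
struct model_request_queue {
    struct model_cmd_filter cmd_filter;
};

/* ... and the disk only points at its queue. */
struct model_gendisk {
    struct model_request_queue *queue;
};

/* Path for code that has a queue but no gendisk to look at. */
struct model_cmd_filter *model_filter_from_queue(struct model_request_queue *q)
{
    return q ? &q->cmd_filter : NULL;
}

/* Path for the per-disk user interface: follow the disk's queue pointer. */
struct model_cmd_filter *model_filter_from_disk(struct model_gendisk *disk)
{
    return disk ? model_filter_from_queue(disk->queue) : NULL;
}
```

    Both lookups end at the same object, which is why the user-visible
    interface can stay per-disk while the other callers need only the queue.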
     

02 Aug, 2008

1 commit


16 Jul, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits)
    ide-floppy: fix unfortunate function naming
    ide-tape: unify idetape_create_read/write_cmd
    ide: add ide_pc_intr() helper
    ide-{floppy,scsi}: read Status Register before stopping DMA engine
    ide-scsi: add more debugging to idescsi_pc_intr()
    ide-scsi: use pc->callback
    ide-floppy: add more debugging to idefloppy_pc_intr()
    ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled
    ide-tape: add ide_tape_io_buffers() helper
    ide-tape: factor out DSC handling from idetape_pc_intr()
    ide-{floppy,tape}: move checking of ->failed_pc to ->callback
    ide: add ide_issue_pc() helper
    ide: add PC_FLAG_DRQ_INTERRUPT pc flag
    ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc()
    ide: add ide_transfer_pc() helper
    ide-scsi: set drive->scsi flag for devices handled by the driver
    ide-{cd,floppy,tape}: remove checking for drive->scsi
    ide: add PC_FLAG_ZIP_DRIVE pc flag
    ide-tape: factor out waiting for good ireason from idetape_transfer_pc()
    ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc()
    ...

    Linus Torvalds
     
  • Some callers use blk_put_request asymmetrically, that is, they use it
    with requests that were not allocated by blk_get_request. As a result,
    blk_put_request has a hack to catch a NULL request_queue. Such callers
    have now been fixed (they use blk_get_request properly), so we can
    safely remove the hack in blk_put_request.

    Signed-off-by: FUJITA Tomonori
    Cc: Borislav Petkov
    Signed-off-by: Jens Axboe
    Signed-off-by: Bartlomiej Zolnierkiewicz

    FUJITA Tomonori
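
    The symmetry argument in this commit can be illustrated with a minimal
    user-space sketch. The model_ names are hypothetical and the in_flight
    counter is an assumption added only to make the pairing observable; this
    is not the kernel's implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct model_queue {
    int in_flight; /* assumption: counter used only to observe pairing */
};

struct model_request {
    struct model_queue *q;
};

/* Allocate a request that is tied to a queue from the start. */
struct model_request *model_get_request(struct model_queue *q)
{
    struct model_request *rq = malloc(sizeof(*rq));

    if (!rq)
        return NULL;
    rq->q = q;
    q->in_flight++;
    return rq;
}

/*
 * Release a request. Because every request now comes from
 * model_get_request(), rq->q is never NULL and no special case is needed.
 */
void model_put_request(struct model_request *rq)
{
    assert(rq->q != NULL);
    rq->q->in_flight--;
    free(rq);
}
```

    Once allocation and release are always paired, the release path can
    treat a missing queue as a bug instead of silently tolerating it.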
     

15 Jul, 2008

1 commit


03 Jul, 2008

2 commits

  • Add test_and_clear and test_and_set.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Some block devices support verifying the integrity of requests by way
    of checksums or other protection information that is submitted along
    with the I/O.

    This patch implements support for generating and verifying integrity
    metadata, as well as correctly merging, splitting and cloning bios and
    requests that have this extra information attached.

    See Documentation/block/data-integrity.txt for more information.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

16 Jun, 2008

1 commit


28 May, 2008

1 commit


25 May, 2008

1 commit

  • As git-grep shows, open_softirq() is always called with the last argument
    being NULL

    block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
    kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
    kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
    kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
    kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
    net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
    net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

    This observation has already been made by Matthew Wilcox in June 2002
    (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

    "I notice that none of the current softirq routines use the data element
    passed to them."

    and the situation hasn't changed since then. So it appears we can safely
    remove that extra argument to save 128 (54) bytes of kernel data (text).

    Signed-off-by: Carlos R. Mafra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Carlos R. Mafra
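
    A rough user-space sketch of the change in shape follows. The model_
    names and the table size of 32 are assumptions made for illustration.
    With 4-byte pointers, 32 slots would account for the 128 bytes of data
    quoted above, but that reading is an inference and is not stated in the
    commit message.

```c
#include <stddef.h>

#define MODEL_NR_SOFTIRQS 32 /* assumption: table size for the sketch */

/* Before: each slot carried a data pointer that every caller set to NULL. */
struct model_action_old {
    void (*action)(struct model_action_old *);
    void *data;
};

/* After: only the handler remains. */
struct model_action_new {
    void (*action)(struct model_action_new *);
};

struct model_action_new model_vec[MODEL_NR_SOFTIRQS];

/* Example handler that ignores its argument. */
void model_noop(struct model_action_new *a)
{
    (void)a;
}

/* Two-argument registration, mirroring the shape after the patch. */
void model_open_softirq(int nr, void (*action)(struct model_action_new *))
{
    model_vec[nr].action = action;
}

/* Bytes of table data saved by dropping the pointer from every slot. */
size_t model_bytes_saved(void)
{
    return MODEL_NR_SOFTIRQS *
           (sizeof(struct model_action_old) - sizeof(struct model_action_new));
}
```

    On common ABIs model_bytes_saved() comes out as one pointer per slot.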
     

15 May, 2008

1 commit

  • As setting and clearing queue flags now requires that we hold a spinlock
    on the queue, and as blk_queue_stack_limits is called without that lock,
    get the lock inside blk_queue_stack_limits.

    For blk_queue_stack_limits to be able to find the right lock, each md
    personality needs to set q->queue_lock to point to the appropriate lock.
    Those personalities which didn't previously use a spin_lock use
    q->__queue_lock. So always initialise that lock when allocated.

    With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no
    longer cause warnings as it will be clear that the proper lock is held.

    Thanks to Dan Williams for review and fixing the silly bugs.

    Signed-off-by: NeilBrown
    Cc: Dan Williams
    Cc: Jens Axboe
    Cc: Alistair John Strachan
    Cc: Nick Piggin
    Cc: "Rafael J. Wysocki"
    Cc: Jacek Luczak
    Cc: Prakash Punnoor
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Brown
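
    The locking rule in this commit can be sketched in user space as below.
    The lock here is a plain "held" marker rather than a real spinlock, and
    all model_ names, including the flag bit, are hypothetical; own_lock
    stands in for q->__queue_lock.

```c
#include <assert.h>
#include <stddef.h>

struct model_lock {
    int held; /* stand-in for a spinlock: just records ownership */
};

struct model_queue {
    struct model_lock own_lock;    /* models the embedded q->__queue_lock */
    struct model_lock *queue_lock; /* the lock that guards the flags */
    unsigned long flags;
};

#define MODEL_FLAG_CLUSTER 0 /* hypothetical flag bit */

void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Always initialise the embedded lock and use it as the default. */
void model_queue_init(struct model_queue *q)
{
    q->own_lock.held = 0;
    q->queue_lock = &q->own_lock;
    q->flags = 0;
}

/* Flag updates insist that the queue lock is held. */
void model_flag_clear(struct model_queue *q, int bit)
{
    assert(q->queue_lock && q->queue_lock->held);
    q->flags &= ~(1UL << bit);
}

/* Called without the lock, so it takes the lock around the flag update. */
void model_stack_limits(struct model_queue *top,
                        const struct model_queue *bottom)
{
    if (!(bottom->flags & (1UL << MODEL_FLAG_CLUSTER))) {
        model_lock_acquire(top->queue_lock);
        model_flag_clear(top, MODEL_FLAG_CLUSTER);
        model_lock_release(top->queue_lock);
    }
}
```

    Because queue_lock always points at an initialised lock, the stacking
    helper can take it without knowing which personality set it up.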
     

07 May, 2008

2 commits

  • get_part() is fairly expensive, as it loops over the partitions in
    O(N) to find the right one. To make matters worse, many normal IO
    paths end up looking up the partition twice. Change the stat add
    code to accept a passed-in partition instead.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Original patch from Mikulas Patocka

    Mike Anderson was doing an OLTP benchmark on a computer with 48 physical
    disks mapped to one logical device via device mapper.

    He found that there was a slowdown on request_queue->lock in function
    generic_unplug_device. The slowdown is caused by the fact that when some
    code calls unplug on the device mapper, device mapper calls unplug on all
    physical disks. These unplug calls take the lock, find that the queue is
    already unplugged, release the lock and exit.

    With the below patch, performance of the benchmark was increased by 18%
    (the whole OLTP application, not just block layer microbenchmarks).

    So I'm submitting this patch for upstream. I think the patch is correct:
    when several threads call plug and unplug simultaneously, it is
    unspecified whether the queue ends up plugged, so the patch can't make
    this worse. The caller that plugged the queue should unplug it anyway
    (if it doesn't, there is a 3ms timeout).

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Jens Axboe

    Jens Axboe
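
    The idea behind the unplug change can be sketched as an unlocked early
    check in front of the locked path. This is a user-space model under
    stated assumptions: the model_ names are hypothetical, and the lock is
    replaced by a counter so that the number of acquisitions can be observed.

```c
struct model_queue {
    int plugged;
    int lock_acquisitions; /* assumption: counts how often the lock is taken */
    int dispatches;        /* assumption: counts real unplug work */
};

static void model_lock(struct model_queue *q)
{
    q->lock_acquisitions++;
}

static void model_unlock(struct model_queue *q)
{
    (void)q;
}

/* Unplug with an unlocked early check. */
void model_unplug(struct model_queue *q)
{
    if (!q->plugged) /* peek without the lock; already unplugged, so skip */
        return;

    model_lock(q);
    if (q->plugged) { /* recheck now that the lock is held */
        q->plugged = 0;
        q->dispatches++;
    }
    model_unlock(q);
}
```

    With many underlying disks that are already unplugged, each call
    returns before touching the lock, which is the contention the benchmark
    was hitting.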
     

01 May, 2008

1 commit


29 Apr, 2008

5 commits

  • This patch changes rq->cmd from the static array to a pointer to
    support large commands.

    We rarely handle large commands. So for optimization, a struct request
    still has a static array for a command. rq_init sets rq->cmd pointer
    to the static array.

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • This is a preparation for changing rq->cmd from the static array to a
    pointer.

    Signed-off-by: FUJITA Tomonori
    Cc: Boaz Harrosh
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • This renames rq_init() to blk_rq_init() and exports it. Any path that
    hands the request to the block layer needs to call it to initialize
    the request.

    This is a preparation for large command support, which needs to
    initialize the request in a proper way (that is, just doing a memset()
    will not work).

    Signed-off-by: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • We can save some atomic ops in the IO path, if we clearly define
    the rules of how to modify the queue flags.

    Signed-off-by: Jens Axboe

    Nick Piggin
     
  • This requires moving rq_init() from get_request() to blk_alloc_request().
    The upside is that we can now require an rq_init() from any path that
    wishes to hand the request to the block layer.

    rq_init() will be exported for the code that uses struct request
    without blk_get_request.

    This is a preparation for large command support, which needs to
    initialize struct request in a proper way (that is, just doing a
    memset() will not work).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
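
    The commits above explain why a bare memset() is no longer enough once
    rq->cmd becomes a pointer. A minimal user-space sketch of that point
    follows; the model_ names, the inline_cmd field and the inline size are
    assumptions for illustration, not the kernel's definitions.

```c
#include <string.h>

#define MODEL_MAX_CDB 16 /* assumption: inline command size for the sketch */

struct model_request {
    unsigned char inline_cmd[MODEL_MAX_CDB]; /* storage for the common case */
    unsigned char *cmd; /* points at inline_cmd or at a larger buffer */
    unsigned int cmd_len;
};

/* Initialise a request: zero it, then aim cmd at the inline storage. */
void model_rq_init(struct model_request *rq)
{
    memset(rq, 0, sizeof(*rq));
    rq->cmd = rq->inline_cmd; /* the step a bare memset() would miss */
}

/* Switch a request to a caller-provided buffer for a large command. */
void model_rq_set_large_cmd(struct model_request *rq,
                            unsigned char *buf, unsigned int len)
{
    rq->cmd = buf;
    rq->cmd_len = len;
}
```

    A request that was only zeroed would carry a NULL cmd pointer, which is
    why every path that builds a request has to go through the init helper.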
     

04 Mar, 2008

3 commits

  • This patch removes the unused exports of blk_{get,put}_queue.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     
  • The meaning of rq->data_len was changed from the true data length to
    the length of an allocated buffer. This breaks SG_IO friends and
    bsg. This patch restores the meaning of rq->data_len to the true data
    length and adds rq->extra_len to store an extended length (due to
    drain buffer and padding).

    This patch also removes the code to update bio in blk_rq_map_user
    introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
    The commit adjusts bio according to memory alignment
    (queue_dma_alignment). However, memory alignment is NOT padding
    alignment. This adjustment also breaks SG_IO friends and bsg. Padding
    alignment needs to be fixed in a proper way (by a separate patch).

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • kernel-doc for block/:
    - add missing parameters
    - fix one function's parameter list (remove blank line)
    - add 2 source files to docbook for non-exported kernel-doc functions

    Signed-off-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Randy Dunlap
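
    The split between rq->data_len and rq->extra_len described in the
    commits above can be sketched as follows. This is a user-space model:
    the model_ names are hypothetical, and the way padding and drain sizes
    are combined here is an assumption for illustration, not the kernel's
    code.

```c
struct model_request {
    unsigned int data_len;  /* true data length, as seen by the submitter */
    unsigned int extra_len; /* extra bytes from padding and drain buffer */
};

/*
 * Record the lengths for a transfer of len bytes.
 * pad_mask must be a power of two minus one (for example 3 for 4-byte
 * padding); drain_size is the size of the drain buffer, or 0 if none.
 */
void model_rq_map(struct model_request *rq, unsigned int len,
                  unsigned int pad_mask, unsigned int drain_size)
{
    unsigned int pad = ((len + pad_mask) & ~pad_mask) - len;

    rq->data_len = len; /* never inflated */
    rq->extra_len = pad + drain_size;
}

/* Length that lower layers would size their buffers and tables for. */
unsigned int model_rq_buffer_len(const struct model_request *rq)
{
    return rq->data_len + rq->extra_len;
}
```

    Keeping the two values separate lets the true length be reported back
    unchanged while lower layers still see the extended size.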
     

19 Feb, 2008

2 commits

  • With padding and draining moved into it, the block layer may now
    extend requests as directed by queue parameters, so a request now has
    two sizes: the original request size and the extended size, which
    matches the size of the area pointed to by bios and later by sgs. The
    latter size is what lower layers are primarily interested in when
    allocating, filling up DMA tables and setting up the controller.

    Both padding and draining extend the data area to accommodate
    controller characteristics. As any controller which speaks SCSI can
    handle underflows, feeding a larger data area is safe.

    So, this patch makes the primary data length field, request->data_len,
    indicate the size of the full data area and adds a separate length
    field, request->raw_data_len, for the unmodified request size. The
    latter is used when reporting to higher layers (userland) and wherever
    the original request size should be fed to the controller or device.

    Signed-off-by: Tejun Heo
    Cc: James Bottomley
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • request_cachep needlessly became global.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Jens Axboe

    Adrian Bunk
     

08 Feb, 2008

3 commits


01 Feb, 2008

2 commits


30 Jan, 2008

6 commits