07 Jul, 2011

1 commit

  • Due to the recently identified overflow in read_capacity_16() it was
    possible for max_discard_sectors to be zero while discards were still
    enabled on the associated device's queue.

    Eliminate the possibility of blkdev_issue_discard() looping
    infinitely (a sketch of the guard follows this entry).

    Interestingly, this issue wasn't identified until a device whose
    discard_granularity was 0, due to the read_capacity_16 overflow, was
    consumed by blk_stack_limits() to construct the limits for a
    higher-level DM multipath device. The multipath device's resulting
    limits never had the discard limits stacked, because
    blk_stack_limits() only does so when the bottom device's
    discard_granularity != 0. The result was that the multipath device's
    limits.max_discard_sectors was 0.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
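
    A minimal sketch of the kind of guard this fix calls for, assuming it
    sits ahead of the bio-splitting loop in blkdev_issue_discard() (the
    placement and exact expressions are assumptions, not the verbatim
    patch):

        struct request_queue *q = bdev_get_queue(bdev);
        unsigned int max_discard_sectors;

        if (!blk_queue_discard(q))
                return -EOPNOTSUPP;

        /*
         * A zero limit (e.g. from the read_capacity_16() overflow) would
         * otherwise let the splitting loop below run forever, since
         * nr_sects would never shrink.
         */
        max_discard_sectors = min(q->limits.max_discard_sectors,
                                  UINT_MAX >> 9);
        if (unlikely(!max_discard_sectors))
                return -EOPNOTSUPP;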
     

07 May, 2011

3 commits

  • Currently we return -EOPNOTSUPP from blkdev_issue_discard() if any of
    the bios fails because the underlying device does not support discard
    requests. However, if the device is, for example, a dm device composed
    of devices of which some support discard and some do not, it is fine
    for some bios to fail with EOPNOTSUPP; that does not mean discard is
    unsupported altogether.

    This commit removes the check for bios that failed with EOPNOTSUPP and
    changes blkdev_issue_discard() to return "operation not supported" if
    and only if the device itself does not support discard, rather than
    just a part of the device, as failing bios might suggest (a sketch of
    the changed error handling follows this entry).

    This change also fixes a problem with the BLKDISCARD ioctl(), which
    now works correctly on such dm devices.

    Signed-off-by: Lukas Czerner
    CC: Jens Axboe
    CC: Jeff Moyer
    Signed-off-by: Jens Axboe

    Lukas Czerner
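
    A sketch of the changed error handling, assuming "not supported" is
    now decided from the device's own queue flag up front rather than
    from bio completions (helper names follow the block layer of the
    time):

        struct request_queue *q = bdev_get_queue(bdev);

        /* The only case that returns -EOPNOTSUPP: the queue itself. */
        if (!blk_queue_discard(q))
                return -EOPNOTSUPP;

        /*
         * In the completion path, a bio failing with -EOPNOTSUPP (e.g.
         * on one leg of a dm stack) still fails that bio, but is no
         * longer promoted to a blanket "discard unsupported" return
         * value from blkdev_issue_discard().
         */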
     
  • In blkdev_issue_zeroout() we are submitting regular WRITE bios, so we
    do not need to check specifically for -EOPNOTSUPP on error. There is
    also no need for the submit: label, because the only way out of the
    while loop is an error, and in that case we really want to exit rather
    than try again. Also remove the check for (sz == 0), since sz can
    never be zero at that point. (A sketch of the resulting loop follows
    this entry.)

    Signed-off-by: Lukas Czerner
    Reviewed-by: Jeff Moyer
    CC: Dmitry Monakhov
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Lukas Czerner
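
    A sketch of the simplified loop this describes (the BIO_MAX_PAGES
    sizing and ZERO_PAGE payload are assumptions based on how
    blkdev_issue_zeroout() was structured at the time):

        while (nr_sects != 0) {
                /* no submit: label and no sz == 0 test needed */
                bio = bio_alloc(gfp_mask,
                                min(nr_sects, (sector_t)BIO_MAX_PAGES));
                if (!bio) {
                        ret = -ENOMEM;
                        break;  /* exit outright; a retry cannot help */
                }
                /* ... add ZERO_PAGE payload, advance sector/nr_sects ... */
                submit_bio(WRITE, bio);
        }
        /*
         * These are regular WRITEs, so the error path needs no special
         * -EOPNOTSUPP handling.
         */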
     
  • Currently we wait for every submitted REQ_DISCARD bio separately, which
    can have the unwanted consequence of repeatedly flushing the queue.
    Instead, submit the bios in batches and wait for the entire batch,
    narrowing the window in which other I/Os can come in.

    Use bio_batch_end_io() and struct bio_batch for that purpose, the same
    mechanism used by blkdev_issue_zeroout() (a sketch follows this
    entry). Also change bio_batch_end_io() so we always set !BIO_UPTODATE
    on error, and remove the check for bb, since we are the only user of
    this function and we always set it.

    Remove bio_get()/bio_put() from blkdev_issue_discard(), since
    bio_alloc() and bio_batch_end_io() are doing the same thing; it is
    not needed anymore.

    I have done some simple dd testing, with surprising results. The
    script I used is:

    for i in $(seq 10); do
            echo $i
            dd if=/dev/sdb1 of=/dev/sdc1 bs=4k &
            sleep 5
    done
    /usr/bin/time -f %e ./blkdiscard /dev/sdc1

    Running time of BLKDISCARD on the whole device (elapsed seconds):

        with patch:       0.95
        without patch:   15.58

    So in this artificial test the kernel with the patch applied is
    approximately 16x faster at discarding the device.

    Signed-off-by: Lukas Czerner
    CC: Dmitry Monakhov
    CC: Jens Axboe
    CC: Jeff Moyer
    Signed-off-by: Jens Axboe

    Lukas Czerner
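
    A sketch of the batching machinery, modeled on the struct bio_batch
    that blkdev_issue_zeroout() already used (the field layout is an
    assumption; the count-down form reflects the race fix in the
    11 Mar, 2011 entry below):

        struct bio_batch {
                atomic_t                done;   /* completions outstanding */
                unsigned long           flags;  /* BIO_UPTODATE cleared on error */
                struct completion       *wait;  /* woken once per batch */
        };

        static void bio_batch_end_io(struct bio *bio, int err)
        {
                struct bio_batch *bb = bio->bi_private;

                if (err)        /* always record the error ... */
                        clear_bit(BIO_UPTODATE, &bb->flags);
                if (atomic_dec_and_test(&bb->done))
                        complete(bb->wait);
                bio_put(bio);   /* pairs with the bio_alloc() reference */
        }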
     

25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     

11 Mar, 2011

1 commit

  • BZ29402
    https://bugzilla.kernel.org/show_bug.cgi?id=29402

    We can hit serious mis-synchronization in the bio completion path of
    blkdev_issue_zeroout(), leading to a panic.

    The problem is that before calling wait_for_completion() in
    blkdev_issue_zeroout() we check whether bb.done equals issued (the
    number of submitted bios). If it does, we skip the
    wait_for_completion() and just exit the function, since there is
    nothing to wait for. However, there is an ordering problem:
    bio_batch_end_io() calls atomic_inc(&bb->done) before complete(), so
    blkdev_issue_zeroout() can conclude that all bios have completed and
    exit. By the time bio_batch_end_io() gets to call complete(bb->wait),
    bb and its wait completion no longer exist, having been allocated on
    the stack in blkdev_issue_zeroout() ==> panic!

    (thread 1)                        (thread 2)
    bio_batch_end_io()                blkdev_issue_zeroout()
      if (bb) {                         ...
        if (bb->end_io)                 ...
          bb->end_io(bio, err);         ...
        atomic_inc(&bb->done);          ...
      ...                               while (issued != atomic_read(&bb.done))
      ...                               (let issued == bb.done)
      ...                               (do the rest of the function)
      ...                               return ret;
      complete(bb->wait);
      ^^^^^^^^
      panic

    We can fix this easily by simplifying bio_batch and the completion
    counting (a sketch follows this entry).

    Also remove bio_end_io_t *end_io, since it is not used.

    Signed-off-by: Lukas Czerner
    Reported-by: Eric Whitney
    Tested-by: Eric Whitney
    Reviewed-by: Jeff Moyer
    CC: Dmitry Monakhov
    Signed-off-by: Jens Axboe

    Lukas Czerner
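
    A sketch of the simplified counting that closes the race, assuming it
    takes the usual count-down form: the submitter holds its own
    reference, so complete() is by construction the last touch of the
    on-stack batch.

        /* Setup in blkdev_issue_zeroout(): the submitter's reference. */
        atomic_set(&bb.done, 1);

        /* Per submitted bio: */
        atomic_inc(&bb.done);
        submit_bio(WRITE, bio);

        /*
         * Drop the submitter's reference; wait only if bios are still
         * in flight.  Whichever side drops the count to zero calls
         * complete(), so bb cannot be unwound while the completion
         * path still needs it.
         */
        if (!atomic_dec_and_test(&bb.done))
                wait_for_completion(&bb.wait);

    The completion callback then uses the matching atomic_dec_and_test(),
    as in the bio_batch_end_io() sketch under the 07 May, 2011 entry
    above.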
     

17 Sep, 2010

1 commit

  • All the blkdev_issue_* helpers can only sanely be used by synchronous
    callers. To issue cache flushes or barriers asynchronously, a caller
    needs to set up a bio by itself, with a completion callback to move
    its asynchronous state machine ahead. So drop the BLKDEV_IFL_WAIT
    flag that is always specified when calling blkdev_issue_*, and also
    remove the now-unused flags argument to blkdev_issue_flush and
    blkdev_issue_zeroout. For blkdev_issue_discard we need to keep the
    argument for the secure discard flag, which gains a more descriptive
    name and loses the bitops-vs-flag confusion. (A sketch of the
    resulting call shapes follows this entry.)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
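
    A sketch of the resulting call shapes (the post-change argument lists
    are assumptions based on the description):

        /* Before: every caller passed BLKDEV_IFL_WAIT explicitly. */
        blkdev_issue_flush(bdev, GFP_KERNEL, NULL, BLKDEV_IFL_WAIT);

        /* After: the helpers are synchronous by definition, no flags. */
        blkdev_issue_flush(bdev, GFP_KERNEL, NULL);
        blkdev_issue_zeroout(bdev, sector, nr_sects, GFP_KERNEL);

        /*
         * blkdev_issue_discard() keeps a flags argument solely for
         * secure discard, now a plain flag value rather than a bit
         * number.
         */
        blkdev_issue_discard(bdev, sector, nr_sects, GFP_KERNEL,
                             BLKDEV_DISCARD_SECURE);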
     

10 Sep, 2010

1 commit

  • Remove support for barriers on discards, which is unused now. Also
    remove the DISCARD_NOBARRIER I/O type in favour of just setting the
    rw flags up locally in blkdev_issue_discard.

    tj: Also remove DISCARD_SECURE and use REQ_SECURE directly.

    Signed-off-by: Christoph Hellwig
    Acked-by: Mike Snitzer
    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

12 Aug, 2010

1 commit

  • Secure discard is the same as discard, except that all copies of the
    discarded sectors (perhaps created by garbage collection) must also
    be erased. (A userspace sketch of issuing one follows this entry.)

    Signed-off-by: Adrian Hunter
    Acked-by: Jens Axboe
    Cc: Kyungmin Park
    Cc: Madhusudhan Chikkature
    Cc: Christoph Hellwig
    Cc: Ben Gardiner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Hunter
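
    For illustration, a minimal userspace sketch of requesting a secure
    discard through the BLKSECDISCARD ioctl that accompanied this work
    (the device path and range are placeholders; the {offset, length}
    layout matches BLKDISCARD):

        #include <fcntl.h>
        #include <stdint.h>
        #include <stdio.h>
        #include <sys/ioctl.h>
        #include <linux/fs.h>           /* BLKSECDISCARD */

        int main(void)
        {
                /* byte offset and byte length of the range to erase */
                uint64_t range[2] = { 0, 1024 * 1024 };
                int fd = open("/dev/mmcblk0", O_WRONLY);

                if (fd < 0) {
                        perror("open");
                        return 1;
                }
                if (ioctl(fd, BLKSECDISCARD, &range) < 0) {
                        perror("BLKSECDISCARD");
                        return 1;
                }
                return 0;
        }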
     

08 Aug, 2010

2 commits

  • If the queue doesn't have a limit set, or just has the UINT_MAX we
    default to, we could be sending down a discard request that isn't of
    the correct granularity when the block size is > 512 bytes.

    Fix this by adjusting max_discard_sectors down to the proper
    alignment (a sketch follows this entry).

    Signed-off-by: Jens Axboe

    Jens Axboe
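
    A sketch of the adjustment, assuming discard_granularity is a
    power-of-two byte value as elsewhere in the block layer:

        unsigned int max_discard_sectors =
                min(q->limits.max_discard_sectors, UINT_MAX >> 9);

        if (q->limits.discard_granularity) {
                unsigned int disc_sects =
                        q->limits.discard_granularity >> 9;

                /* round down to a whole number of discard granules */
                max_discard_sectors &= ~(disc_sects - 1);
        }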
     
  • Allocating a fixed payload for discard requests always was a horrible
    hack, and it's now coming to bite us when adding support for discard
    in DM/MD.

    So change the code to leave the allocation of a payload to the
    low-level driver. Unfortunately that means we'll need another hack,
    which allows us to update the various block-layer length fields
    indicating that we have a payload. Instead of hiding this in sd.c,
    which we already partially do for UNMAP support, add a documented
    helper in the core block layer for it. (A usage sketch follows this
    entry.)

    Signed-off-by: Christoph Hellwig
    Acked-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Christoph Hellwig
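
    A sketch of how a low-level driver might use such a helper; the
    blk_add_request_payload() name and the SCSI-disk context here are
    assumptions based on the sd.c/UNMAP remark above:

        /*
         * In the driver's request-prep path: attach a driver-allocated
         * payload page and let the block-layer helper fix up the
         * request's length fields to account for it.
         */
        struct page *page = alloc_page(GFP_ATOMIC | __GFP_ZERO);

        if (!page)
                return BLKPREP_DEFER;

        /* ... write the UNMAP parameter list into the page ... */
        blk_add_request_payload(rq, page, sector_size);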
     
