22 Jun, 2009

2 commits

  • If -EOPNOTSUPP was returned and the request was a barrier request, retry it
    without barrier.

    Retry all regions for now. Barriers are submitted only for one-region requests,
    so it doesn't matter. (In the future, retries can be limited to the actual
    regions that failed.)

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Add another field, eopnotsupp_bits. It is a subset of error_bits, representing
    regions that returned -EOPNOTSUPP. (The bit is set in both error_bits and
    eopnotsupp_bits.)

    This value will be used in further patches.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
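
    The subset relationship can be sketched as follows. This is an illustration only: `struct io_status` and `record_region_error()` are hypothetical names; only the two field names come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical container for the two bitmaps named in the commit. */
struct io_status {
    unsigned long error_bits;       /* one bit per failed region */
    unsigned long eopnotsupp_bits;  /* subset: failed with -EOPNOTSUPP */
};

/* Record the completion status of one region. */
static void record_region_error(struct io_status *io, unsigned int region,
                                int error)
{
    assert(region < sizeof(unsigned long) * CHAR_BIT);
    if (!error)
        return;
    io->error_bits |= 1UL << region;
    if (error == -EOPNOTSUPP)
        io->eopnotsupp_bits |= 1UL << region;  /* set in both bitmaps */
}
```

    Because the second bitmap is only ever set together with the first, `eopnotsupp_bits & ~error_bits` stays zero.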
     

03 Apr, 2009

1 commit

  • If someone sends a signal to a process performing a synchronous dm-io call,
    the kernel may crash.

    The function sync_io attempts to exit with -EINTR if it has a pending signal;
    however, the structure "io" is allocated on the stack, so already-submitted io
    requests end up touching unallocated stack space and corrupting kernel memory.

    sync_io sets its state to TASK_UNINTERRUPTIBLE, so the signal can't break out
    of io_schedule() --- however, if the signal was pending before sync_io entered
    the while (1) loop, the corruption of kernel memory will happen.

    There is no way to cancel in-progress IOs, so the best solution is to ignore
    signals at this point.

    Cc: stable@kernel.org
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     

17 Mar, 2009

1 commit

  • dm-io calls bio_get_nr_vecs to get the maximum number of pages to use
    for a given device. It allocates one additional bio_vec to use
    internally but fails to respect BIO_MAX_PAGES, so fix this.

    This was the likely cause of:
    https://bugzilla.redhat.com/show_bug.cgi?id=173153

    Cc: stable@kernel.org
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
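
    A rough sketch of the clamping this commit describes, not the actual patch. The helper name `dm_io_num_bvecs()` is hypothetical, and the value 256 for BIO_MAX_PAGES is an assumption made for illustration.

```c
#include <assert.h>

#define BIO_MAX_PAGES 256   /* assumed value, for illustration only */

/* Hypothetical helper: number of bio_vecs to request for one bio.
 * dev_nr_vecs   - what the device reports it can take
 * pages_needed  - pages still to be transferred */
static unsigned int dm_io_num_bvecs(unsigned int dev_nr_vecs,
                                    unsigned int pages_needed)
{
    unsigned int n = dev_nr_vecs < pages_needed ? dev_nr_vecs : pages_needed;

    n += 1;                     /* the extra bio_vec used internally */
    if (n > BIO_MAX_PAGES)      /* never exceed the allocator's limit */
        n = BIO_MAX_PAGES;
    return n;
}
```

    Without the final clamp, a device reporting 256 vectors would lead to a request for 257, one more than the assumed limit.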
     

18 Feb, 2009

1 commit


29 Dec, 2008

1 commit

  • Instead of having a global bio slab cache, add a reference to one
    in each bio_set that is created. This allows for personalized slabs
    in each bio_set, so that they can have bios of different sizes.

    This means we can personalize the bios we return. File systems may
    want to embed the bio inside another structure, to avoid allocating
    more items (and stuffing them in ->bi_private) after they get a bio.
    Or we may want to embed a number of bio_vecs directly at the end
    of a bio, to avoid doing two allocations to return a bio. This is now
    possible.

    Signed-off-by: Jens Axboe

    Jens Axboe
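
    The embedding idea can be sketched in user space as below. Everything here is hypothetical (`struct fake_bio`, `struct fs_io`, the helper names, the local container-of macro); it only shows one allocation serving both the bio and the caller's private data, with the outer structure recovered from the bio pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_bio {               /* stand-in for struct bio */
    int sector;
};

struct fs_io {                  /* hypothetical filesystem wrapper */
    int fs_private;             /* data that would otherwise hang off ->bi_private */
    struct fake_bio bio;        /* bio embedded in the same allocation */
};

/* Local equivalent of the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One allocation returns a usable bio plus its wrapper. */
static struct fake_bio *fs_bio_alloc(int fs_private)
{
    struct fs_io *io = calloc(1, sizeof(*io));

    if (!io)
        return NULL;
    io->fs_private = fs_private;
    return &io->bio;
}

/* Recover the wrapper from the embedded bio. */
static struct fs_io *fs_io_from_bio(struct fake_bio *bio)
{
    return container_of_sketch(bio, struct fs_io, bio);
}

static void fs_bio_free(struct fake_bio *bio)
{
    free(fs_io_from_bio(bio));
}
```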
     

22 Oct, 2008

1 commit


25 Apr, 2008

4 commits

  • Remove an avoidable 3ms delay on some dm-raid1 and kcopyd I/O.

    It is specified that any submitted bio without the BIO_RW_SYNC flag may plug the
    queue (i.e. block the requests from being dispatched to the physical device).

    The queue is unplugged when the caller calls the blk_unplug() function. Usually, the
    sequence is that someone calls submit_bh to submit IO on a buffer. The IO plugs
    the queue and waits (to be possibly joined with other adjacent bios). Then, when
    the caller calls wait_on_buffer(), it unplugs the queue and submits the IOs to
    the disk.

    This was happening:

    When doing O_SYNC writes, the function fsync_buffers_list() submits a list of
    bios to dm_raid1, the bios are added to the dm_raid1 write queue and kmirrord is
    woken up.

    fsync_buffers_list() calls wait_on_buffer(). That unplugs the queue, but
    there are no bios on the device queue as they are still in the dm_raid1 queue.

    wait_on_buffer() starts waiting until the IO is finished.

    kmirrord is scheduled, kmirrord takes bios and submits them to the devices.

    The submitted bio plugs the harddisk queue but there is no one to unplug it.
    (The process that called wait_on_buffer() is already sleeping.)

    So there is a 3ms timeout, after which the queues on the harddisks are
    unplugged and requests are processed.

    This 3ms timeout meant that in certain workloads (e.g. O_SYNC, 8kb writes),
    dm-raid1 is 10 times slower than md raid1.

    Every time we submit something asynchronously via dm_io, we must unplug the
    queue to actually send the request to the device.

    This patch adds an unplug call to kmirrord - while processing requests, it keeps
    the queue plugged (so that adjacent bios can be merged); when it finishes
    processing all the bios, it unplugs the queue to submit the bios.

    It also fixes kcopyd which has the same potential problem. All kcopyd requests
    are submitted with BIO_RW_SYNC.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Acked-by: Jens Axboe

    Mikulas Patocka
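
    The plug/unplug behaviour described in this commit can be modelled with a toy queue. This is a simulation under stated assumptions, not block-layer code: `struct fake_queue`, `fq_submit()`, `fq_unplug()` and `worker_process()` are hypothetical names, and the timer-driven unplug is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_DEPTH 16          /* arbitrary limit for the toy model */

struct fake_queue {
    bool plugged;
    int pending;                /* queued but not sent to the device */
    int dispatched;             /* sent to the device */
};

/* Send everything that is waiting and clear the plug. */
static void fq_unplug(struct fake_queue *q)
{
    q->dispatched += q->pending;
    q->pending = 0;
    q->plugged = false;
}

/* A non-sync submission plugs the queue so that adjacent requests can
 * be merged; a sync submission is dispatched immediately. */
static void fq_submit(struct fake_queue *q, bool sync)
{
    assert(q->pending < QUEUE_DEPTH);
    q->pending++;
    if (sync)
        fq_unplug(q);
    else
        q->plugged = true;
}

/* Worker in the style the commit describes: keep the queue plugged
 * while submitting a batch, then unplug once at the end. */
static void worker_process(struct fake_queue *q, int nr_bios)
{
    for (int i = 0; i < nr_bios; i++)
        fq_submit(q, false);
    fq_unplug(q);
}
```

    In this model, a lone non-sync submission leaves one request pending with nothing dispatched, which corresponds to the stall the commit describes; the explicit unplug at the end of `worker_process()` is what flushes it.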
     
  • Publish the dm-io, dm-log and dm-kcopyd headers in include/linux.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • Clean up the dm-io interface to prepare for publishing it in include/linux.

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Alasdair G Kergon

    Heinz Mauelshagen
     
  • Rename 'error' to 'error_bits' for clarity.

    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     

29 Mar, 2008

1 commit


10 Oct, 2007

1 commit

  • As bi_end_io is only called once when the request is complete,
    the 'size' argument is now redundant. Remove it.

    Now there is no need for bio_endio to subtract the size completed
    from bi_size. So don't do that either.

    While we are at it, change bi_end_io to return void.

    Signed-off-by: Neil Brown
    Signed-off-by: Jens Axboe

    NeilBrown
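
    A small mock of the completion callback shape after this change: no size argument and a void return. `struct fake_bio`, `fake_bio_endio()` and `count_completion()` are hypothetical stand-ins used only to show the calling convention.

```c
#include <assert.h>
#include <stddef.h>

struct fake_bio;

/* Callback form after the change: called once, no size, returns void. */
typedef void (fake_end_io_t)(struct fake_bio *bio, int error);

struct fake_bio {
    unsigned int bi_size;
    fake_end_io_t *bi_end_io;
    void *bi_private;
};

/* Completion no longer subtracts anything from bi_size; it just
 * invokes the callback exactly once. */
static void fake_bio_endio(struct fake_bio *bio, int error)
{
    assert(bio != NULL);
    if (bio->bi_end_io)
        bio->bi_end_io(bio, error);
}

/* Example callback: counts how many times it was invoked. */
static void count_completion(struct fake_bio *bio, int error)
{
    int *calls = bio->bi_private;

    (void)error;
    (*calls)++;
}
```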
     

13 Jul, 2007

1 commit

  • bio_alloc_bioset() will return NULL if 'num_vecs' is too large.
    Use bio_get_nr_vecs() to get an estimate of the maximum number.

    Cc: stable@kernel.org
    Signed-off-by: "Jun'ichi Nomura"
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Linus Torvalds

    Jun'ichi Nomura
     

10 May, 2007

4 commits

  • Remove old dm-io interface.

    Signed-off-by: Milan Broz
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milan Broz
     
  • Add a new API to dm-io.c that uses a private mempool and bio_set for each
    client.

    The new functions to use are dm_io_client_create(), dm_io_client_destroy(),
    dm_io_client_resize() and dm_io().

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Alasdair G Kergon
    Cc: Milan Broz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heinz Mauelshagen
     
  • Introduce struct dm_io_client to prepare for per-client mempools and bio_sets.

    Temporary functions bios() and io_pool() choose between the per-client
    structures and the global ones so the old and new interfaces can co-exist.

    Make error_bits optional.

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Alasdair G Kergon
    Cc: Milan Broz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heinz Mauelshagen
     
  • Delay decrementing the 'struct io' reference count until after the bio has
    been freed so that a bio destructor function may reference it. Required by a
    later patch.

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Alasdair G Kergon
    Cc: Milan Broz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heinz Mauelshagen
     

30 Apr, 2007

1 commit

  • Currently we scale the mempool sizes depending on memory installed
    in the machine, except for the bio pool itself which sits at a fixed
    256 entry pre-allocation.

    There's really no point in "optimizing" this OOM path; we just need
    enough preallocated to make progress. A single unit is enough; let's
    scale it down to 2 just to be on the safe side.

    This patch saves ~150kb of pinned kernel memory on a 32-bit box.

    Signed-off-by: Jens Axboe

    Jens Axboe
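
    Why a tiny reserve is enough can be illustrated with a toy pool. This is a user-space sketch, not the kernel mempool: `struct tiny_pool` and its helpers are hypothetical, the out-of-memory condition is simulated with a flag, and where this sketch returns NULL a blocking pool would instead wait for an element to be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define POOL_MIN 2              /* reserve size, matching the commit's choice */

struct tiny_pool {
    void *reserve[POOL_MIN];
    int nr_reserved;
    size_t elem_size;
    bool simulate_oom;          /* test hook: pretend malloc() fails */
};

static void tiny_pool_destroy(struct tiny_pool *p)
{
    while (p->nr_reserved > 0)
        free(p->reserve[--p->nr_reserved]);
}

/* Preallocate the reserve up front. */
static bool tiny_pool_init(struct tiny_pool *p, size_t elem_size)
{
    p->elem_size = elem_size;
    p->nr_reserved = 0;
    p->simulate_oom = false;
    while (p->nr_reserved < POOL_MIN) {
        void *e = malloc(elem_size);

        if (!e) {
            tiny_pool_destroy(p);
            return false;
        }
        p->reserve[p->nr_reserved++] = e;
    }
    return true;
}

/* Try the normal allocator first; fall back to the reserve. */
static void *tiny_pool_alloc(struct tiny_pool *p)
{
    void *e = p->simulate_oom ? NULL : malloc(p->elem_size);

    if (!e && p->nr_reserved > 0)
        e = p->reserve[--p->nr_reserved];
    return e;
}

/* Freed elements refill the reserve before going back to the system. */
static void tiny_pool_free(struct tiny_pool *p, void *e)
{
    if (p->nr_reserved < POOL_MIN)
        p->reserve[p->nr_reserved++] = e;
    else
        free(e);
}
```

    As long as every in-flight element is eventually freed, each free makes another allocation possible, so forward progress does not depend on the reserve being large.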
     

09 Dec, 2006

1 commit

  • The existing code allocates an extra slot in bi_io_vec[] and uses it to store
    the region number.

    This patch hides the extra slot from bio_add_page() so the region number can't
    get overwritten.

    Also remove a hard-coded SECTOR_SHIFT and fix a typo in a comment.

    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Alasdair G Kergon
    Cc: Milan Broz
    Cc: dm-devel@redhat.com
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heinz Mauelshagen
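
    The "hidden slot" trick can be sketched with a mock bio. All types and helpers here are hypothetical; the sketch assumes the region number is kept in the slot just past the advertised maximum, which is one way to realize what the commit describes.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_bvec {
    unsigned long bv_len;
};

struct fake_bio {
    unsigned short bi_vcnt;         /* slots in use */
    unsigned short bi_max_vecs;     /* slots the add-page path may use */
    struct fake_bvec bi_io_vec[];   /* nr_vecs + 1 slots are allocated */
};

/* Allocate one slot more than advertised and park the region number
 * in it, out of reach of fake_add_page(). */
static struct fake_bio *alloc_with_region(unsigned short nr_vecs,
                                          unsigned int region)
{
    struct fake_bio *bio = calloc(1, sizeof(*bio) +
                                  ((size_t)nr_vecs + 1) * sizeof(struct fake_bvec));

    if (!bio)
        return NULL;
    bio->bi_max_vecs = nr_vecs;                 /* hide the extra slot */
    bio->bi_io_vec[nr_vecs].bv_len = region;
    return bio;
}

static unsigned int get_region(const struct fake_bio *bio)
{
    return (unsigned int)bio->bi_io_vec[bio->bi_max_vecs].bv_len;
}

/* Mock of the add-page path: it only ever touches advertised slots. */
static int fake_add_page(struct fake_bio *bio, unsigned long len)
{
    if (bio->bi_vcnt >= bio->bi_max_vecs)
        return 0;                               /* bio is full */
    bio->bi_io_vec[bio->bi_vcnt++].bv_len = len;
    return 1;
}
```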
     

27 Mar, 2006

1 commit

  • This patch changes several mempool users, all of which are basically just
    wrappers around kmalloc(), to use the common mempool_kmalloc/kfree, rather
    than their own wrapper function, removing a bunch of duplicated code.

    Signed-off-by: Matthew Dobson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Dobson
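
    The deduplication can be pictured with a user-space analogue: instead of each pool user writing its own allocate/free wrapper, one shared pair of callbacks takes the element size through the opaque pool-data pointer. The names below (`pool_malloc`, `pool_mfree`, `struct pool_ops`, `malloc_pool_ops`) are hypothetical and built on `malloc()`, not on the kernel helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef void *(pool_alloc_fn)(void *pool_data);
typedef void (pool_free_fn)(void *element, void *pool_data);

/* Shared allocator callback: pool_data carries the element size. */
static void *pool_malloc(void *pool_data)
{
    size_t size = (size_t)(uintptr_t)pool_data;

    return malloc(size);
}

/* Shared free callback: the size is not needed to free. */
static void pool_mfree(void *element, void *pool_data)
{
    (void)pool_data;
    free(element);
}

struct pool_ops {
    pool_alloc_fn *alloc_fn;
    pool_free_fn *free_fn;
    void *pool_data;
};

/* Every user that just wants "elements of N bytes" shares this. */
static struct pool_ops malloc_pool_ops(size_t size)
{
    struct pool_ops ops = { pool_malloc, pool_mfree, (void *)(uintptr_t)size };

    assert(size > 0);
    return ops;
}
```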
     

09 Oct, 2005

1 commit

  • - added typedef unsigned int __nocast gfp_t;

    - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
    the same warnings as far as sparse is concerned, doesn't change
    generated code (from gcc point of view we replaced unsigned int with
    typedef) and documents what's going on far better.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
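
    A compilable sketch of the typedef from this commit. Outside a sparse run the annotation expands to nothing, so the compiler sees a plain `unsigned int`. The flag names and values (`SKETCH_GFP_WAIT`, `SKETCH_GFP_IO`) and `may_sleep()` are made up for illustration.

```c
#include <assert.h>

/* Under sparse (__CHECKER__) the annotation is an attribute; for an
 * ordinary compiler it disappears. */
#ifndef __nocast
# ifdef __CHECKER__
#  define __nocast __attribute__((nocast))
# else
#  define __nocast
# endif
#endif

typedef unsigned int __nocast gfp_t;

/* Hypothetical flag values, for illustration only. */
#define SKETCH_GFP_WAIT ((gfp_t)0x10u)
#define SKETCH_GFP_IO   ((gfp_t)0x40u)

/* The typedef documents intent in the signature: this parameter is a
 * set of allocation flags, not an arbitrary integer. */
static int may_sleep(gfp_t flags)
{
    return (flags & SKETCH_GFP_WAIT) != 0;
}
```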
     

08 Sep, 2005

1 commit

  • Jens:

    ->bi_set is totally unnecessary bloat of struct bio. Just define a proper
    destructor for the bio and it already knows what bio_set it belongs to.

    Peter:

    Fixed the bugs.

    Signed-off-by: Jens Axboe
    Signed-off-by: Peter Osterlund
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Osterlund
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds