15 Sep, 2018

1 commit

  • [ Upstream commit d6c02a9beb67f13d5f14f23e72fa9981e8b84477 ]

    In commit ed996a52c868 ("block: simplify and cleanup bvec pool
    handling"), the value of the slab index is incremented by one in
    bvec_alloc() after the allocation is done to indicate an index value of
    0 does not need to be later freed.

    bvec_nr_vecs() was not updated accordingly, and thus returns the wrong
    value. Decrement idx before performing the lookup.

    Fixes: ed996a52c868 ("block: simplify and cleanup bvec pool handling")
    Signed-off-by: Greg Edwards
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Greg Edwards
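
    The off-by-one described above can be sketched as follows. This is a
    minimal user-space illustration, not the kernel code: the table contents
    and the names pool_nr_vecs, stored_idx_for() and nr_vecs_for() are
    assumptions made for the example. The stored index is the slab index
    plus one, so the lookup has to subtract one again.

```c
#include <assert.h>

/* Illustrative pool sizes, one entry per slab (assumed values). */
static const unsigned short pool_nr_vecs[] = { 1, 4, 16, 64, 128, 256 };
#define POOL_COUNT (sizeof(pool_nr_vecs) / sizeof(pool_nr_vecs[0]))

/*
 * Allocation side: pick the smallest slab that fits and return its
 * index plus one, so that a stored value of 0 means "nothing to free".
 */
static unsigned int stored_idx_for(unsigned int nr)
{
    for (unsigned int i = 0; i < POOL_COUNT; i++)
        if (nr <= pool_nr_vecs[i])
            return i + 1;
    return 0; /* too large for any slab */
}

/* Lookup side: undo the +1 before indexing the table. */
static unsigned int nr_vecs_for(unsigned int stored_idx)
{
    assert(stored_idx >= 1 && stored_idx <= POOL_COUNT);
    return pool_nr_vecs[stored_idx - 1];
}
```

    Without the "- 1" in the lookup, a request for 4 vectors would be
    reported as having room for 16.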
     

03 Aug, 2018

2 commits

  • commit 5151842b9d8732d4cbfa8400b40bff894f501b2f upstream.

    After the bio has been updated to represent the remaining sectors, reset
    bi_done so bio_rewind_iter() does not rewind further than it should.

    This resolves a bio_integrity_process() failure on reads where the
    original request was split.

    Fixes: 63573e359d05 ("bio-integrity: Restore original iterator on verify stage")
    Signed-off-by: Greg Edwards
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Greg Edwards
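
    A rough sketch of why the progress counter has to be reset after a
    split. The struct and helper names below are hypothetical and only
    loosely modelled on an iterator with a "done" field; the real
    bio_rewind_iter() has a different signature.

```c
#include <assert.h>

/* Simplified stand-in for an I/O iterator; field names are illustrative. */
struct iter_sketch {
    unsigned long pos;  /* current position, in bytes */
    unsigned long size; /* bytes remaining */
    unsigned long done; /* bytes advanced since the last reset */
};

static void iter_advance(struct iter_sketch *it, unsigned long bytes)
{
    assert(bytes <= it->size);
    it->pos += bytes;
    it->size -= bytes;
    it->done += bytes;
}

/* Undo everything recorded in 'done'. */
static void iter_rewind(struct iter_sketch *it)
{
    it->pos -= it->done;
    it->size += it->done;
    it->done = 0;
}

/*
 * After a split, the remainder starts at its current position, so
 * progress made before the split must not be rewound later.
 */
static void iter_mark_split_remainder(struct iter_sketch *it)
{
    it->done = 0;
}
```

    If the reset is skipped, a later rewind moves the remainder back to the
    start of the original request instead of the start of the remainder.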
     
  • commit b403ea2404889e1227812fa9657667a1deb9c694 upstream.

    If the last page of the bio is not "full", the length of the last
    vector slot needs to be corrected. This slot has the index
    (bio->bi_vcnt - 1), but only in bio->bi_io_vec. In the "bv" helper
    array, which is shifted by the value of bio->bi_vcnt at function
    invocation, the correct index is (nr_pages - 1).

    v2: improved readability following suggestions from Ming Lei.
    v3: followed a formatting suggestion from Christoph Hellwig.

    Fixes: 2cefe4dbaadf ("block: add bio_iov_iter_get_pages()")
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Ming Lei
    Reviewed-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin Wilck
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Martin Wilck
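
    The indexing issue can be illustrated with a small user-space sketch.
    Everything here (vec_sketch, append_pages(), the 4096-byte page size) is
    an assumption for illustration; it only mirrors the idea that a helper
    pointer shifted by the old vector count must be indexed with
    (nr_pages - 1), not (vcnt - 1).

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

struct vec_sketch {
    size_t len;
    size_t off;
};

/*
 * Append the pages covering [offset, offset + size) to 'vecs'.
 * 'bv' points past the vectors that were already in use on entry.
 */
static void append_pages(struct vec_sketch *vecs, unsigned int *vcnt,
                         unsigned int max_vecs, size_t offset, size_t size)
{
    assert(size > 0 && offset < SKETCH_PAGE_SIZE);

    size_t nr_pages = (offset + size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    assert(*vcnt + nr_pages <= max_vecs);

    struct vec_sketch *bv = vecs + *vcnt;

    for (size_t i = 0; i < nr_pages; i++) {
        bv[i].len = SKETCH_PAGE_SIZE;
        bv[i].off = 0;
    }

    /* The first page may start part-way in. */
    bv[0].off = offset;
    bv[0].len -= offset;

    /* The last page may not be full: correct it via the shifted array. */
    size_t slack = nr_pages * SKETCH_PAGE_SIZE - offset - size;
    bv[nr_pages - 1].len -= slack;

    *vcnt += (unsigned int)nr_pages;
}
```

    With three vectors already in use, an offset of 100 and a size of 5000,
    the two new slots end up with lengths 3996 and 1004, which add up to
    the requested 5000 bytes.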
     

26 Apr, 2018

1 commit

  • [ Upstream commit 20d59023c5ec4426284af492808bcea1f39787ef ]

    We inadvertently set it again on the source bio, but we need
    to set it on the new split bio instead.

    Fixes: fbbaf700e7b1 ("block: trace completion of all bios.")
    Signed-off-by: Goldwyn Rodrigues
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Goldwyn Rodrigues
     

08 Apr, 2018

1 commit


30 Dec, 2017

1 commit

  • commit 111be883981748acc9a56e855c8336404a8e787c upstream.

    If a bio is throttled and split after throttling, the bio could be
    resubmitted and enter throttling again. This causes part of the
    bio to be charged multiple times. If the cgroup has an IO limit, the
    double charge will significantly harm performance. Bio splits
    became quite common after the arbitrary bio size change.

    To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
    If the bio is cloned/split, we copy the flag to the new bio too, to avoid a
    double charge. However, a cloned bio could be directed to a new disk,
    in which case keeping the flag would be a problem. The observation is that
    we always set a new disk for the bio in this case, so we can clear the
    flag in bio_set_dev().

    This issue has existed for a long time; the arbitrary bio size change just
    makes it worse, so this should go into stable at least since v4.2.

    V1-> V2: Not add extra field in bio based on discussion with Tejun

    Cc: Vivek Goyal
    Acked-by: Tejun Heo
    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Shaohua Li
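
    The flag handling described above might look roughly like this in a
    user-space model. The struct tbio type, the charged_bytes counter and
    the helper names are hypothetical; only the idea (charge once, copy the
    flag on clone/split, clear it when the device changes) comes from the
    commit message.

```c
#include <assert.h>
#include <stdbool.h>

struct tbio {
    unsigned int size;
    int disk;
    bool throttled; /* models the BIO_THROTTLED flag */
};

static unsigned long charged_bytes;

/* Charge the bio to its group unless it has already been charged. */
static void tbio_throttle(struct tbio *b)
{
    if (b->throttled)
        return;
    charged_bytes += b->size;
    b->throttled = true;
}

/* Clone or split: the new bio inherits the flag from its source. */
static struct tbio tbio_clone(const struct tbio *src, unsigned int size)
{
    struct tbio c = *src;

    assert(size <= src->size);
    c.size = size;
    return c;
}

/* Redirecting to another disk clears the flag, so the bio is charged there. */
static void tbio_set_dev(struct tbio *b, int disk)
{
    b->disk = disk;
    b->throttled = false;
}
```

    A split-off piece that goes back through throttling on the same disk is
    skipped, while a clone redirected to a new disk is charged once more.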
     

24 Nov, 2017

1 commit

  • commit 62530ed8b1d07a45dec94d46e521c0c6c2d476e6 upstream.

    A new field was introduced in 74d46992e0d9, bi_partno, instead of using
    bdev->bd_contains and encoding the partition information in the bi_bdev
    field. __bio_clone_fast was changed to copy the disk information, but
    not the partition information. At minimum, this regressed bcache and
    caused data corruption.

    Signed-off-by: Michael Lyle
    Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
    Reported-by: Pavel Goran
    Reported-by: Campbell Steven
    Reviewed-by: Coly Li
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Michael Lyle
     

11 Oct, 2017

3 commits

  • Since "block: support large requests in blk_rq_map_user_iov" we
    started to call it with partially drained iter; that works fine
    on the write side, but reads create a copy of iter for completion
    time. And that needs to take the possibility of ->iov_iter != 0
    into account...

    Cc: stable@vger.kernel.org #v4.5+
    Signed-off-by: Al Viro

    Al Viro
     
  • we need to take care of failure exit as well - pages already
    in bio should be dropped by analogue of bio_unmap_pages(),
    since their refcounts had been bumped only once per reference
    in bio.

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • bio_map_user_iov and bio_unmap_user do unbalanced pages refcounting if
    IO vector has small consecutive buffers belonging to the same page.
    bio_add_pc_page merges them into one, but the page reference is never
    dropped.

    Cc: stable@vger.kernel.org
    Signed-off-by: Vitaly Mayatskikh
    Signed-off-by: Al Viro

    Vitaly Mayatskikh
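
    A small model of the refcount imbalance, under the assumption that one
    reference is taken per user buffer and one is dropped per mapped vector
    on unmap. The types and helpers (struct seg, map_segments(),
    unmap_segments()) are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 4
#define MAX_VECS 8

struct seg {
    int page;
    size_t off;
    size_t len;
};

struct mapped {
    struct seg vecs[MAX_VECS];
    size_t nvecs;
};

static int page_refs[NPAGES];

static void map_segments(struct mapped *m, const struct seg *segs, size_t n)
{
    m->nvecs = 0;
    for (size_t i = 0; i < n; i++) {
        assert(segs[i].page >= 0 && segs[i].page < NPAGES);
        page_refs[segs[i].page]++; /* one reference per user buffer */

        struct seg *last = m->nvecs ? &m->vecs[m->nvecs - 1] : NULL;

        if (last && last->page == segs[i].page &&
            last->off + last->len == segs[i].off) {
            /* Merged into the previous vector: drop the extra reference. */
            last->len += segs[i].len;
            page_refs[segs[i].page]--;
        } else {
            assert(m->nvecs < MAX_VECS);
            m->vecs[m->nvecs++] = segs[i];
        }
    }
}

/* Unmap drops exactly one reference per mapped vector. */
static void unmap_segments(struct mapped *m)
{
    for (size_t i = 0; i < m->nvecs; i++)
        page_refs[m->vecs[i].page]--;
    m->nvecs = 0;
}
```

    Without the decrement in the merge branch, two adjacent buffers in the
    same page would leave one reference behind after unmapping.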
     

08 Sep, 2017

1 commit

  • Pull MD updates from Shaohua Li:
    "This update mainly fixes bugs:

    - Make raid5 ppl support several ppl from Pawel

    - Several raid5-cache bug fixes from Song

    - Bitmap fixes from Neil and Me

    - One raid1/10 regression fix since 4.12 from Me

    - Other small fixes and cleanup"

    * tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
    md/bitmap: disable bitmap_resize for file-backed bitmaps.
    raid5-ppl: Recovery support for multiple partial parity logs
    md: Runtime support for multiple ppls
    md/raid0: attach correct cgroup info in bio
    lib/raid6: align AVX512 constants to 512 bits, not bytes
    raid5: remove raid5_build_block
    md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
    md: replace seq_release_private with seq_release
    md: notify about new spare disk in the container
    md/raid1/10: reset bio allocated from mempool
    md/raid5: release/flush io in raid5_do_work()
    md/bitmap: copy correct data for bitmap super

    Linus Torvalds
     

26 Aug, 2017

1 commit


24 Aug, 2017

1 commit

  • This way we don't need a block_device structure to submit I/O. The
    block_device has different life time rules from the gendisk and
    request_queue and is usually only available when the block device node
    is open. Other callers need to explicitly create one (e.g. the lightnvm
    passthrough code, or the new nvme multipathing code).

    For the actual I/O path all that we need is the gendisk, which exists
    once per block device. But given that the block layer also does
    partition remapping we additionally need a partition index, which is
    used for said remapping in generic_make_request.

    Note that all the block drivers generally want request_queue or
    sometimes the gendisk, so this removes a layer of indirection all
    over the stack.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

10 Aug, 2017

1 commit


02 Aug, 2017

1 commit


11 Jul, 2017

1 commit

  • bio_free isn't a good place to free cgroup info. There are a
    lot of cases where a bio is allocated in a special way (for example, on
    the stack) and never reaches bio_put and hence bio_free, so we leak
    memory. This patch moves the free to bio endio, which should be called
    anyway. The bio_uninit call in bio_free is kept, in case bio endio is
    never called for the bio.

    This assumes ->bi_end_io() doesn't access cgroup info, which seems true
    in my audit.

    This along with Christoph's integrity patch should fix the memory leak
    issue.

    Cc: Christoph Hellwig
    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
     

04 Jul, 2017

4 commits

  • And instead call directly into the integrity code from bio_end_io.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • bio_integrity_trim inherited its interface from bio_trim and accepts
    an offset and size, but this API is error prone because the data offset
    must always be in sync with the bio's data offset. That is why we have
    an integrity update hook in bio_advance().

    So the only meaningful values are: offset == 0, sectors == bio_sectors(bio).
    Let's just remove them completely.

    Reviewed-by: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Dmitry Monakhov
     
  • Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Dmitry Monakhov
     
  • Pull core block/IO updates from Jens Axboe:
    "This is the main pull request for the block layer for 4.13. Not a huge
    round in terms of features, but there's a lot of churn related to some
    core cleanups.

    Note this depends on the UUID tree pull request, that Christoph
    already sent out.

    This pull request contains:

    - A series from Christoph, unifying the error/stats codes in the
    block layer. We now use blk_status_t everywhere, instead of using
    different schemes for different places.

    - Also from Christoph, some cleanups around request allocation and IO
    scheduler interactions in blk-mq.

    - And yet another series from Christoph, cleaning up how we handle
    and do bounce buffering in the block layer.

    - A blk-mq debugfs series from Bart, further improving on the support
    we have for exporting internal information to aid debugging IO
    hangs or stalls.

    - Also from Bart, a series that cleans up the request initialization
    differences across types of devices.

    - A series from Goldwyn Rodrigues, allowing the block layer to return
    failure if we will block and the user asked for non-blocking.

    - Patch from Hannes for supporting setting loop devices block size to
    that of the underlying device.

    - Two series of patches from Javier, fixing various issues with
    lightnvm, particular around pblk.

    - A series from me, adding support for write hints. This comes with
    NVMe support as well, so applications can help guide data placement
    on flash to improve performance, latencies, and write
    amplification.

    - A series from Ming, improving and hardening blk-mq support for
    stopping/starting and quiescing hardware queues.

    - Two pull requests for NVMe updates. Nothing major on the feature
    side, but lots of cleanups and bug fixes. From the usual crew.

    - A series from Neil Brown, greatly improving the bio rescue set
    support. Most notably, this kills the bio rescue work queues, if we
    don't really need them.

    - Lots of other little bug fixes that are all over the place"

    * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
    lightnvm: pblk: set line bitmap check under debug
    lightnvm: pblk: verify that cache read is still valid
    lightnvm: pblk: add initialization check
    lightnvm: pblk: remove target using async. I/Os
    lightnvm: pblk: use vmalloc for GC data buffer
    lightnvm: pblk: use right metadata buffer for recovery
    lightnvm: pblk: schedule if data is not ready
    lightnvm: pblk: remove unused return variable
    lightnvm: pblk: fix double-free on pblk init
    lightnvm: pblk: fix bad le64 assignations
    nvme: Makefile: remove dead build rule
    blk-mq: map all HWQ also in hyperthreaded system
    nvmet-rdma: register ib_client to not deadlock in device removal
    nvme_fc: fix error recovery on link down.
    nvmet_fc: fix crashes on bad opcodes
    nvme_fc: Fix crash when nvme controller connection fails.
    nvme_fc: replace ioabort msleep loop with completion
    nvme_fc: fix double calls to nvme_cleanup_cmd()
    nvme-fabrics: verify that a controller returns the correct NQN
    nvme: simplify nvme_dev_attrs_are_visible
    ...

    Linus Torvalds
     

29 Jun, 2017

1 commit

  • Wen reports significant memory leaks with DIF and O_DIRECT:

    "With nvme device + T10 enabled, On a system it has 256GB and started
    logging /proc/meminfo & /proc/slabinfo for every minute and in an hour
    it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute
    leaking.

    /proc/meminfo | grep SUnreclaim...

    SUnreclaim: 6752128 kB
    SUnreclaim: 6874880 kB
    SUnreclaim: 7238080 kB
    ....
    SUnreclaim: 22307264 kB
    SUnreclaim: 22485888 kB
    SUnreclaim: 22720256 kB

    When testcases with T10 enabled call into __blkdev_direct_IO_simple,
    code doesn't free memory allocated by bio_integrity_alloc. The patch
    fixes the issue. HTX has been run with +60 hours without failure."

    Since __blkdev_direct_IO_simple() allocates the bio on the stack, it
    doesn't go through the regular bio free. This means that any ancillary
    data allocated with the bio through the stack is not freed. Hence, we
    can leak the integrity data associated with the bio, if the device is
    using DIF/DIX.

    Fix this by providing a bio_uninit() and export it, so that we can use
    it to free this data. Note that this is a minimal fix for this issue.
    Any current user of bio's that are allocated outside of
    bio_alloc_bioset() suffers from this issue, most notably some drivers.
    We will fix those in a more comprehensive patch for 4.13. This also
    means that the commit marked as being fixed by this isn't the real
    culprit, it's just the most obvious one out there.

    Fixes: 542ff7bf18c6 ("block: new direct I/O implementation")
    Reported-by: Wen Xiong
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     

28 Jun, 2017

1 commit

  • No functional changes in this patch, we just use up some holes
    in the bio and request structures to define a write hint that
    we pass down the stack.

    Ensure that we don't merge requests that have different life time
    hints assigned to them, and that we inherit the write hint when
    cloning a bio.

    Reviewed-by: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     

19 Jun, 2017

3 commits

  • bio_clone() is no longer used.
    Only bio_clone_bioset() or bio_clone_fast().
    This is for the best, as bio_clone() used fs_bio_set,
    and filesystems are unlikely to want to use bio_clone().

    So remove bio_clone() and all references.
    This includes a fix to some incorrect documentation.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: NeilBrown
    Signed-off-by: Jens Axboe

    NeilBrown
     
  • This patch converts bioset_create() to not create a workqueue by
    default, so allocations will never trigger punt_bios_to_rescuer(). It
    also introduces a new flag BIOSET_NEED_RESCUER which tells
    bioset_create() to preserve the old behavior.

    All callers of bioset_create() that are inside block device drivers,
    are given the BIOSET_NEED_RESCUER flag.

    biosets used by filesystems or other top-level users do not
    need rescuing as the bio can never be queued behind other
    bios. This includes fs_bio_set, blkdev_dio_pool,
    btrfs_bioset, xfs_ioend_bioset, and one allocated by
    target_core_iblock.c.

    biosets used by md/raid do not need rescuing as
    their usage was recently audited and revised to never
    risk deadlock.

    It is hoped that most, if not all, of the remaining biosets
    can end up being the non-rescued version.

    Reviewed-by: Christoph Hellwig
    Credit-to: Ming Lei (minor fixes)
    Reviewed-by: Ming Lei
    Signed-off-by: NeilBrown
    Signed-off-by: Jens Axboe

    NeilBrown
     
  • "flags" arguments are often seen as good API design as they allow
    easy extensibility.
    bioset_create_nobvec() is implemented internally as a variation in
    flags passed to __bioset_create().

    To support future extension, make the internal structure part of the
    API.
    i.e. add a 'flags' argument to bioset_create() and discard
    bioset_create_nobvec().

    Note that the bio_split allocations in drivers/md/raid* do not need
    the bvec mempool - they should have used bioset_create_nobvec().

    Suggested-by: Christoph Hellwig
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: NeilBrown
    Signed-off-by: Jens Axboe

    NeilBrown
     

16 Jun, 2017

1 commit

  • This patch fixes two sparse warnings introduced by the "dedicated
    error codes for the block layer V3" patch series. These changes
    have not been tested.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

09 Jun, 2017

1 commit

  • Replace bi_error with a new bi_status to allow for a clear conversion.
    Note that device mapper overloaded bi_error with a private value, which
    we'll have to keep around at least for now and thus propagate to a
    proper blk_status_t value.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

02 May, 2017

1 commit


12 Apr, 2017

1 commit


07 Apr, 2017

1 commit

  • Currently only dm and md/raid5 bios trigger
    trace_block_bio_complete(). Now that we have bio_chain() and
    bio_inc_remaining(), it is not possible, in general, for a driver to
    know when the bio is really complete. Only bio_endio() knows that.

    So move the trace_block_bio_complete() call to bio_endio().

    Now trace_block_bio_complete() pairs with trace_block_bio_queue().
    Any bio for which a 'queue' event is traced, will subsequently
    generate a 'complete' event.

    There are a few cases where completion tracing is not wanted.
    1/ If blk_update_request() has already generated a completion
    trace event at the 'request' level, there is no point generating
    one at the bio level too. In this case the bi_sector and bi_size
    will have changed, so the bio level event would be wrong

    2/ If the bio hasn't actually been queued yet, but is being aborted
    early, then a trace event could be confusing. Some filesystems
    call bio_endio() but do not want tracing.

    3/ The bio_integrity code interposes itself by replacing bi_end_io,
    then restoring it and calling bio_endio() again. This would produce
    two identical trace events if left like that.

    To handle these, we introduce a flag BIO_TRACE_COMPLETION and only
    produce the trace event when this is set.
    We address point 1 above by clearing the flag in blk_update_request().
    We address point 2 above by only setting the flag when
    generic_make_request() is called.
    We address point 3 above by clearing the flag after generating a
    completion event.

    When bio_split() is used on a bio, particularly in blk_queue_split(),
    there is an extra complication. A new bio is split off the front, and
    may be handled directly without going through generic_make_request().
    The old bio, which has been advanced, is passed to
    generic_make_request(), so it will trigger a trace event a second
    time.
    Probably the best result when a split happens is to see a single
    'queue' event for the whole bio, then multiple 'complete' events - one
    for each component. To achieve this we can:
    - copy the BIO_TRACE_COMPLETION flag to the new bio in bio_split()
    - avoid generating a 'queue' event if BIO_TRACE_COMPLETION is already set.
    This way, the split-off bio won't create a queue event, the original
    won't either even if it is re-submitted to generic_make_request(),
    but both will produce completion events, each for their own range.

    So if generic_make_request() is called (which generates a QUEUED
    event), then bio_endio() will create a single COMPLETE event for each
    range that the bio is split into, unless the driver has explicitly
    requested it not to.

    Signed-off-by: NeilBrown
    Signed-off-by: Jens Axboe

    NeilBrown
     

28 Mar, 2017

1 commit

  • A cgroup gets assigned a low limit, but the cgroup could never dispatch
    enough IO to cross the low limit. In such case, the queue state machine
    will remain in LIMIT_LOW state and all other cgroups will be throttled
    according to low limit. This is unfair for other cgroups. We should
    treat the cgroup idle and upgrade the state machine to lower state.

    We also have a downgrade logic. If the state machine upgrades because of
    cgroup idle (real idle), the state machine will downgrade soon as the
    cgroup is below its low limit. This isn't what we want. A more
    complicated case is cgroup isn't idle when queue is in LIMIT_LOW. But
    when queue gets upgraded to lower state, other cgroups could dispatch
    more IO and this cgroup can't dispatch enough IO, so the cgroup is below
    its low limit and looks like idle (fake idle). In this case, the queue
    should downgrade soon. The key to determine if we should do downgrade is
    to detect if cgroup is truly idle.

    Unfortunately it's very hard to determine if a cgroup is really idle. This
    patch uses the 'think time check' idea from CFQ for the purpose. Please
    note, the idea doesn't work for all workloads. For example, a workload
    with io depth 8 has disk utilization 100%, hence think time is 0, i.e.,
    not idle. But the workload can run higher bandwidth with io depth 16.
    Compared to io depth 16, the io depth 8 workload is idle. We use the
    idea to roughly determine if a cgroup is idle.

    We treat a cgroup idle if its think time is above a threshold (by
    default 1ms for SSD and 100ms for HD). The idea is think time above the
    threshold will start to harm performance. HD is much slower so a longer
    think time is ok.

    The patch (and the later patches) uses 'unsigned long' to track time.
    We convert 'ns' to 'us' with 'ns >> 10'. This is fast but loses
    precision, which should not be a big deal.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
     

26 Mar, 2017

1 commit

  • commit c18a1e0(block: introduce bio_clone_bioset_partial()) introduced
    bio_clone_bioset_partial() for raid1 write behind IO. Now the write behind is
    rewritten by Ming. We don't need the API any more, so revert the commit.

    Cc: Christoph Hellwig
    Reviewed-by: Jens Axboe
    Reviewed-by: Ming Lei
    Signed-off-by: Shaohua Li

    Shaohua Li
     

25 Mar, 2017

1 commit

  • Turns out we can use bio_copy_data in raid1's write behind,
    and we can make alloc_behind_pages() more clean/efficient,
    but we need a partial version of bio_copy_data().

    Signed-off-by: Ming Lei
    Reviewed-by: Jens Axboe
    Signed-off-by: Shaohua Li

    Ming Lei
     

23 Mar, 2017

1 commit


12 Mar, 2017

1 commit

  • Commit 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
    changed current->bio_list so that it did not contain *all* of the
    queued bios, but only those submitted by the currently running
    make_request_fn.

    There are two places which walk the list and requeue selected bios,
    and others that check if the list is empty. These are no longer
    correct.

    So redefine current->bio_list to point to an array of two lists, which
    contain all queued bios, and adjust various code to test or walk both
    lists.

    Signed-off-by: NeilBrown
    Fixes: 79bd99596b73 ("blk: improve order of bio handling in generic_make_request()")
    Signed-off-by: Jens Axboe

    NeilBrown
     

25 Feb, 2017

1 commit

  • Pull md updates from Shaohua Li:
    "Mainly fixes bugs and improves performance:

    - Improve scalability for raid1 from Coly

    - Improve raid5-cache read performance, disk efficiency and IO
    pattern from Song and me

    - Fix a race condition of disk hotplug for linear from Coly

    - A few cleanup patches from Ming and Byungchul

    - Fix a memory leak from Neil

    - Fix WRITE SAME IO failure from me

    - Add doc for raid5-cache from me"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (23 commits)
    md/raid1: fix write behind issues introduced by bio_clone_bioset_partial
    md/raid1: handle flush request correctly
    md/linear: shutup lockdep warnning
    md/raid1: fix a use-after-free bug
    RAID1: avoid unnecessary spin locks in I/O barrier code
    RAID1: a new I/O barrier implementation to remove resync window
    md/raid5: Don't reinvent the wheel but use existing llist API
    md: fast clone bio in bio_clone_mddev()
    md: remove unnecessary check on mddev
    md/raid1: use bio_clone_bioset_partial() in case of write behind
    md: fail if mddev->bio_set can't be created
    block: introduce bio_clone_bioset_partial()
    md: disable WRITE SAME if it fails in underlayer disks
    md/raid5-cache: exclude reclaiming stripes in reclaim check
    md/raid5-cache: stripe reclaim only counts valid stripes
    MD: add doc for raid5-cache
    Documentation: move MD related doc into a separate dir
    md: ensure md devices are freed before module is unloaded.
    md/r5cache: improve journal device efficiency
    md/r5cache: enable chunk_aligned_read with write back cache
    ...

    Linus Torvalds
     

18 Feb, 2017

1 commit


16 Feb, 2017

1 commit

  • md still needs a bio clone (not the fast version) for write behind,
    and it is more efficient to use bio_clone_bioset_partial().

    The idea is simple: just copy the bvec range specified by the
    parameters.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Jens Axboe
    Signed-off-by: Ming Lei
    Signed-off-by: Shaohua Li

    Ming Lei
     

02 Feb, 2017

1 commit


01 Feb, 2017

1 commit

  • Instead of keeping two levels of indirection for request types, fold it
    all into the operations. The little caveat here is that previously
    cmd_type only applied to struct request, while the request and bio op
    fields were set to plain REQ_OP_READ/WRITE even for passthrough
    operations.

    Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
    private requests, although it has to add two for each so that we
    can communicate the data in/out nature of the request.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig