08 Jul, 2013

1 commit


24 Mar, 2013

12 commits

  • This was the only real user of BIO_CLONED, which didn't have very clear
    semantics. Convert to its own flag so we can get rid of BIO_CLONED.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Martin K. Petersen

    Kent Overstreet
     
  • This is for the new bio splitting code. When we split a bio, if the
    split occurred on a bvec boundary we reuse the bvec for the new bio. But
    that means bio_free() can't free it, hence the explicit flag.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    Acked-by: Tejun Heo

    Kent Overstreet
     
  • More utility code to replace stuff that's getting open coded.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: NeilBrown

    Kent Overstreet
     
  • __bio_for_each_segment() iterates bvecs from the specified index
    instead of bio->bv_idx. Currently, the only usage is to walk all the
    bvecs after the bio has been advanced by specifying 0 index.

    For immutable bvecs, we need to split these apart;
    bio_for_each_segment() is going to have a different implementation.
    This will also help document the intent of code that's using it -
    bio_for_each_segment_all() is only legal to use for code that owns the
    bio.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Neil Brown
    CC: Boaz Harrosh

    Kent Overstreet
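
The difference between the two iterators described above can be sketched in userspace. This is a minimal model, not the kernel's definitions: the struct layouts are cut down to the fields the text mentions, and the helper names bytes_remaining() and bytes_total() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins; these are not the kernel definitions. */
struct bio_vec {
    unsigned int bv_len;            /* bytes in this segment */
};

struct bio {
    unsigned short bi_vcnt;         /* number of entries in bi_io_vec */
    unsigned short bi_idx;          /* first segment not yet completed */
    struct bio_vec *bi_io_vec;
};

/* Walk the segments still outstanding, starting at bi_idx. */
#define bio_for_each_segment(bvl, bio, i)                          \
    for ((i) = (bio)->bi_idx, (bvl) = (bio)->bi_io_vec + (i);      \
         (i) < (bio)->bi_vcnt;                                     \
         (i)++, (bvl)++)

/* Walk every segment from index 0; only meaningful for the bio's owner. */
#define bio_for_each_segment_all(bvl, bio, i)                      \
    for ((i) = 0, (bvl) = (bio)->bi_io_vec;                        \
         (i) < (bio)->bi_vcnt;                                     \
         (i)++, (bvl)++)

/* Hypothetical helper: bytes left after the bio has been advanced. */
static unsigned int bytes_remaining(struct bio *bio)
{
    struct bio_vec *bv;
    unsigned int sum = 0;
    int i;

    bio_for_each_segment(bv, bio, i)
        sum += bv->bv_len;
    return sum;
}

/* Hypothetical helper: bytes in all segments, regardless of bi_idx. */
static unsigned int bytes_total(struct bio *bio)
{
    struct bio_vec *bv;
    unsigned int sum = 0;
    int i;

    bio_for_each_segment_all(bv, bio, i)
        sum += bv->bv_len;
    return sum;
}
```

With three segments and bi_idx set to 1, bytes_remaining() skips the first segment while bytes_total() still counts it, which is why the second form only makes sense for code that owns the bio.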
     
  • This gets open coded quite a bit and it's tricky to get right, so make a
    generic version and convert some existing users over to it instead.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe

    Kent Overstreet
     
  • Random cleanup - this code was duplicated and it's not really specific
    to md.

    Also added the ability to return the actual error code.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: NeilBrown
    Acked-by: Tejun Heo

    Kent Overstreet
     
  • Just a little convenience macro - main reason to add it now is preparing
    for immutable bio vecs, it'll reduce the size of the patch that puts
    bi_sector/bi_size/bi_idx into a struct bvec_iter.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Lars Ellenberg
    CC: Jiri Kosina
    CC: Alasdair Kergon
    CC: dm-devel@redhat.com
    CC: Neil Brown
    CC: Martin Schwidefsky
    CC: Heiko Carstens
    CC: linux-s390@vger.kernel.org
    CC: Chris Mason
    CC: Steven Whitehouse
    Acked-by: Steven Whitehouse

    Kent Overstreet
     
  • This is prep work for immutable bio vecs; we first want to centralize
    where bvecs are modified.

    Next two patches convert some existing code to use this function.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe

    Kent Overstreet
     
  • This adds a pointer to the bvec array to struct bio_integrity_payload,
    instead of the bvecs always being inline; then the bvecs are allocated
    with bvec_alloc_bs().

    Changed bvec_alloc_bs() and bvec_free_bs() to take a pointer to a
    mempool instead of the bioset, so that bio integrity can use a different
    mempool for its bvecs, and thus avoid a potential deadlock.

    This is eventually for immutable bio vecs - immutable bvecs aren't
    useful if we still have to copy them, hence the need for the pointer.
    Less code is always nice too, though.

    Also, bio_integrity_alloc() was using fs_bio_set if no bio_set was
    specified. This was wrong - using the bio_set doesn't protect us from
    memory allocation failures, because we just used kmalloc for the
    bio_integrity_payload. But it does introduce the possibility of
    deadlock, if for some reason we weren't supposed to be using fs_bio_set.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Martin K. Petersen

    Kent Overstreet
     
  • bio_integrity_split() seemed to be confusing pointers and arrays -
    bip_vec in bio_integrity_payload was an array appended to the end of the
    payload, so the bio_vecs in struct bio_pair should have come after the
    bio_integrity_payload they're for.

    Fix it by making bip_vec a pointer to the inline vecs - a later patch is
    going to make more use of this pointer.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Martin K. Petersen

    Kent Overstreet
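
The pointer-versus-array distinction above can be sketched as follows. This is a userspace illustration under assumptions: the struct is reduced to three members, bip_inline_vecs is assumed as the name of the trailing storage, and payload_alloc() and payload_use_external() are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>

struct bio_vec {
    unsigned int bv_len;
};

/* Toy payload: bip_vec is a pointer, so the vecs may live anywhere. */
struct payload {
    struct bio_vec *bip_vec;            /* where the vecs actually live */
    unsigned int bip_vcnt;
    struct bio_vec bip_inline_vecs[];   /* optional storage appended to the struct */
};

/* Hypothetical helper: allocate a payload whose vecs are stored inline. */
static struct payload *payload_alloc(unsigned int nr_inline)
{
    struct payload *p;

    p = calloc(1, sizeof(*p) + nr_inline * sizeof(struct bio_vec));
    if (!p)
        return NULL;
    p->bip_vec = p->bip_inline_vecs;    /* point at the trailing storage */
    p->bip_vcnt = nr_inline;
    return p;
}

/* Hypothetical helper: retarget the payload at vecs stored elsewhere. */
static void payload_use_external(struct payload *p, struct bio_vec *vecs,
                                 unsigned int nr)
{
    p->bip_vec = vecs;
    p->bip_vcnt = nr;
}
```

Code that always goes through bip_vec does not need to know whether the vecs are inline or separately allocated.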
     
  • Previously, if we ever try to allocate more than once from the same bio
    set while running under generic_make_request() (i.e. a stacking block
    driver), we risk deadlock.

    This is because of the code in generic_make_request() that converts
    recursion to iteration; any bios we submit won't actually be submitted
    (so they can complete and eventually be freed) until after we return -
    this means if we allocate a second bio, we're blocking the first one
    from ever being freed.

    Thus if enough threads call into a stacking block driver at the same
    time with bios that need multiple splits, and the bio_set's reserve gets
    used up, we deadlock.

    This can be worked around in the driver code - we could check if we're
    running under generic_make_request(), then mask out __GFP_WAIT when we
    go to allocate a bio, and if the allocation fails punt to workqueue and
    retry the allocation.

    But this is tricky and not a generic solution. This patch solves it for
    all users by inverting the previously described technique. We allocate a
    rescuer workqueue for each bio_set, and then in the allocation code if
    there are bios on current->bio_list we would be blocking, we punt them
    to the rescuer workqueue to be submitted.

    This guarantees forward progress for bio allocations under
    generic_make_request() provided each bio is submitted before allocating
    the next, and provided the bios are freed after they complete.

    Note that this doesn't do anything for allocation from other mempools.
    Instead of allocating per bio data structures from a mempool, code
    should use bio_set's front_pad.

    Tested it by forcing the rescue codepath to be taken (by disabling the
    first GFP_NOWAIT attempt), and then ran it with bcache (which does a lot
    of arbitrary bio splitting) and verified that the rescuer was being
    invoked.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    Acked-by: Tejun Heo
    Reviewed-by: Muthukumar Ratty

    Kent Overstreet
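
The forward-progress argument above can be illustrated with a toy model. Everything here is a stand-in: there is no real workqueue, the "rescuer" runs synchronously, queued bios are assumed to complete as soon as they are submitted, and all toy_ and pool_ names are hypothetical.

```c
#include <assert.h>

/* Entries left in the bio_set's reserve. */
struct toy_pool {
    int free;
};

/* Bios sitting on current->bio_list; each one holds a pool entry. */
struct toy_task {
    int queued;
};

/* Non-blocking allocation attempt: fails when the reserve is empty. */
static int pool_try_alloc(struct toy_pool *p)
{
    if (p->free == 0)
        return 0;
    p->free--;
    return 1;
}

/* Under generic_make_request() a submitted bio is only queued. */
static void task_queue_bio(struct toy_task *t)
{
    t->queued++;
}

/*
 * Rescuer: actually submit the queued bios. In this model they
 * complete immediately and hand their entries back to the pool.
 */
static void rescuer_submit(struct toy_pool *p, struct toy_task *t)
{
    p->free += t->queued;
    t->queued = 0;
}

/* Try the fast path first; if it fails, punt queued bios and retry. */
static int alloc_with_rescuer(struct toy_pool *p, struct toy_task *t)
{
    if (pool_try_alloc(p))
        return 1;
    rescuer_submit(p, t);
    return pool_try_alloc(p);
}
```

With a reserve of one entry, a second allocation made while the first bio is still queued fails on the plain path but succeeds once the queued bio has been handed to the rescuer.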
     
  • This is prep work for the next patch, which embeds a struct bio_list in
    struct bio_set.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe

    Kent Overstreet
     

20 Sep, 2012

2 commits

  • The WRITE SAME command supported on some SCSI devices allows the same
    block to be efficiently replicated throughout a block range. Only a
    single logical block is transferred from the host and the storage device
    writes the same data to all blocks described by the I/O.

    This patch implements support for WRITE SAME in the block layer. The
    blkdev_issue_write_same() function can be used by filesystems and block
    drivers to replicate a buffer across a block range. This can be used to
    efficiently initialize software RAID devices, etc.

    Signed-off-by: Martin K. Petersen
    Acked-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Martin K. Petersen
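
From the device's point of view, the WRITE SAME semantics described above amount to the loop below. This is a userspace sketch over an in-memory buffer; toy_write_same() and its parameters are hypothetical and do not reflect the signature of blkdev_issue_write_same().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Replicate one logical block across a range of blocks.
 * 'dev' stands in for the storage medium; only 'block' crosses the
 * host/device boundary, the copying happens on the device side.
 */
static void toy_write_same(unsigned char *dev, size_t block_size,
                           size_t first_block, size_t nr_blocks,
                           const unsigned char *block)
{
    size_t i;

    assert(dev != NULL && block != NULL);
    for (i = 0; i < nr_blocks; i++)
        memcpy(dev + (first_block + i) * block_size, block, block_size);
}
```

Only a single block of payload is supplied however large the range is, which is what makes the command attractive for initializing large regions.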
     
  • Remove special-casing of non-rw fs style requests (discard). The nomerge
    flags are consolidated in blk_types.h, and rq_mergeable() and
    bio_mergeable() have been modified to use them.

    bio_is_rw() is used in place of bio_has_data() in a few places. This is
    done to distinguish true reads and writes from other fs type requests
    that carry a payload (e.g. write same).

    Signed-off-by: Martin K. Petersen
    Acked-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

09 Sep, 2012

5 commits

  • Previously, there was bio_clone() but it only allocated from the fs bio
    set; as a result various users were open coding it and using
    __bio_clone().

    This changes bio_clone() to become bio_clone_bioset(), and then we add
    bio_clone() and bio_clone_kmalloc() as wrappers around it, making use of
    the functionality the last patch added.

    This will also help in a later patch changing how bio cloning works.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: NeilBrown
    CC: Alasdair Kergon
    CC: Boaz Harrosh
    CC: Jeff Garzik
    Acked-by: Jeff Garzik
    Signed-off-by: Jens Axboe

    Kent Overstreet
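
The wrapper arrangement above can be sketched in userspace. This is a model under assumptions: malloc() stands in for every allocator, a NULL bio_set is assumed to mean "plain kmalloc", the gfp mask argument is omitted, and the toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct bio_set {
    int unused;                 /* placeholder; a real bio_set holds mempools */
};

struct bio {
    unsigned long long bi_sector;
    unsigned int bi_size;
    struct bio_set *bi_pool;    /* NULL here means "came from plain malloc" */
};

/* Stand-in for the filesystem bio set. */
static struct bio_set toy_fs_bio_set;

/* The one real implementation: clone 'src', recording where it came from. */
static struct bio *toy_bio_clone_bioset(const struct bio *src,
                                        struct bio_set *bs)
{
    struct bio *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    *b = *src;
    b->bi_pool = bs;
    return b;
}

/* Thin wrapper: clone from the fs bio set. */
static struct bio *toy_bio_clone(const struct bio *src)
{
    return toy_bio_clone_bioset(src, &toy_fs_bio_set);
}

/* Thin wrapper: clone without any bio set. */
static struct bio *toy_bio_clone_kmalloc(const struct bio *src)
{
    return toy_bio_clone_bioset(src, NULL);
}
```

Callers that previously open coded the clone can pick a wrapper, or call the bioset variant directly with their own set.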
     
  • Previously, bio_kmalloc() and bio_alloc_bioset() behaved slightly
    differently because there was some almost-duplicated code - this fixes
    some of that.

    The important change is that previously bio_kmalloc() always set
    bi_io_vec = bi_inline_vecs, even if nr_iovecs == 0 - unlike
    bio_alloc_bioset(). This would cause bio_has_data() to return true; I
    don't know if this resulted in any actual bugs but it was certainly
    wrong.

    bio_kmalloc() and bio_alloc_bioset() also have different arbitrary
    limits on nr_iovecs - 1024 (UIO_MAXIOV) for bio_kmalloc(), 256
    (BIO_MAX_PAGES) for bio_alloc_bioset(). This patch doesn't fix that, but
    at least they're enforced closer together and hopefully they will be
    fixed in a later patch.

    This'll also help with some future cleanups - there are a fair number of
    functions that allocate bios (e.g. bio_clone()), and now they don't have
    to be duplicated for bio_alloc(), bio_alloc_bioset(), and bio_kmalloc().

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    v7: Re-add dropped comments, improve patch description
    Signed-off-by: Jens Axboe

    Kent Overstreet
     
  • Now that we've got generic code for freeing bios allocated from bio
    pools, this isn't needed anymore.

    This patch also makes bio_free() static, since without bi_destructor
    there should be no need for it to be called anywhere else.

    bio_free() is now only called from bio_put, so we can refactor those a
    bit - move some code from bio_put() to bio_free() and kill the redundant
    bio->bi_next = NULL.

    v5: Switch to BIO_KMALLOC_POOL ((void *)~0), per Boaz
    v6: BIO_KMALLOC_POOL now NULL, drop bio_free's EXPORT_SYMBOL
    v7: No #define BIO_KMALLOC_POOL anymore

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    Signed-off-by: Jens Axboe

    Kent Overstreet
     
  • Reusing bios is something that's been highly frowned upon in the past,
    but driver code keeps doing it anyways. If it's going to happen anyways,
    we should provide a generic method.

    This'll help with getting rid of bi_destructor - drivers/block/pktcdvd.c
    was open coding it, by doing a bio_init() and resetting bi_destructor.

    This required reordering struct bio, but the block layer is not yet
    nearly fast enough for any cacheline effects to matter here.

    v5: Add a define BIO_RESET_BITS, to be very explicit about what parts of
    bio->bi_flags are saved.
    v6: Further commenting verbosity, per Tejun
    v9: Add a function comment

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Kent Overstreet
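
Why a reset needs the struct reordered, and what a BIO_RESET_BITS style mask does, can be sketched as below. This is a userspace model: the field list is cut down, the value 12 for TOY_RESET_BITS is an arbitrary assumption, and the real function may do more than this (for example setting status flags afterwards).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fields above bi_max_vecs are wiped on reset; the rest survive. */
struct toy_bio {
    unsigned long bi_flags;
    unsigned long long bi_sector;
    unsigned int bi_size;
    /* Everything from here on is preserved across a reset. */
    unsigned int bi_max_vecs;
    void *bi_pool;
};

/* Number of leading bytes that a reset clears. */
#define TOY_RESET_BYTES offsetof(struct toy_bio, bi_max_vecs)

/* Flag bits below this index are cleared; bits at or above it are kept. */
#define TOY_RESET_BITS 12

static void toy_bio_reset(struct toy_bio *bio)
{
    /* Save the high flag bits that describe how the bio was allocated. */
    unsigned long keep = bio->bi_flags & (~0UL << TOY_RESET_BITS);

    memset(bio, 0, TOY_RESET_BYTES);
    bio->bi_flags = keep;
}
```

Grouping the resettable fields at the front is what lets a single memset() do the work, which is one plausible reading of why the struct had to be reordered.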
     
  • Now that bios keep track of where they were allocated from,
    bio_integrity_alloc_bioset() becomes redundant.

    Remove bio_integrity_alloc_bioset() and drop bio_set argument from the
    related functions and make them use bio->bi_pool.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Martin K. Petersen
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Kent Overstreet
     

02 Apr, 2012

1 commit

  • cgroup/for-3.5 contains the following changes which blk-cgroup needs
    to proceed with the on-going cleanup.

    * Dynamic addition and removal of cftypes to make config/stat file
    handling modular for policies.

    * cgroup removal update to not wait for css references to drain to fix
    blkcg removal hang caused by cfq caching cfqgs.

    Pull in cgroup/for-3.5 into block/for-3.5/core. This causes the
    following conflicts in block/blk-cgroup.c.

    * 761b3ef50e "cgroup: remove cgroup_subsys argument from callbacks"
    conflicts with blkiocg_pre_destroy() addition and blkiocg_attach()
    removal. Resolved by removing @subsys from all subsys methods.

    * 676f7c8f84 "cgroup: relocate cftype and cgroup_subsys definitions in
    controllers" conflicts with ->pre_destroy() and ->attach() updates
    and removal of modular config. Resolved by dropping forward
    declarations of the methods and applying updates to the relocated
    blkio_subsys.

    * 4baf6e3325 "cgroup: convert all non-memcg controllers to the new
    cftype interface" builds upon the previous item. Resolved by adding
    ->base_cftypes to the relocated blkio_subsys.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

25 Mar, 2012

1 commit

  • Pull cleanup from Paul Gortmaker:
    "The changes shown here are to unify linux's BUG support under the one
    <linux/bug.h> file. Due to historical reasons, we have some BUG code
    in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in
    linux/kernel.h predates the addition of linux/bug.h, but old code in
    kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h
    was including <asm/bug.h> to pseudo link them.

    This has caused confusion[1] and general yuck/WTF[2] reactions. Here
    is an example that violates the principle of least surprise:

    CC lib/string.o
    lib/string.c: In function 'strlcat':
    lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
    make[2]: *** [lib/string.o] Error 1
    $
    $ grep linux/bug.h lib/string.c
    #include <linux/bug.h>
    $

    We've included <linux/bug.h> for the BUG infrastructure and yet we
    still get a compile fail! [We've not kernel.h for BUILD_BUG_ON.] Ugh -
    very confusing for someone who is new to kernel development.

    With the above in mind, the goals of this changeset are:

    1) find and fix any include/*.h files that were relying on the
    implicit presence of BUG code.
    2) find and fix any C files that were consuming kernel.h and hence
    relying on implicitly getting some/all BUG code.
    3) Move the BUG related code living in kernel.h to <linux/bug.h>
    4) remove the asm/bug.h from kernel.h to finally break the chain.

    During development, the order was more like 3-4, build-test, 1-2. But
    to ensure that git history for bisect doesn't get needless build
    failures introduced, the commits have been reordered to fix the problem
    areas in advance.

    [1] https://lkml.org/lkml/2012/1/3/90
    [2] https://lkml.org/lkml/2012/1/17/414"

    Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
    and linux-next.

    * tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
    kernel.h: doesn't explicitly use bug.h, so don't include it.
    bug: consolidate BUILD_BUG_ON with other bug code
    BUG: headers with BUG/BUG_ON etc. need linux/bug.h
    bug.h: add include of it to various implicit C users
    lib: fix implicit users of kernel.h for TAINT_WARN
    spinlock: macroize assert_spin_locked to avoid bug.h dependency
    x86: relocate get/set debugreg fcns to include/asm/debugreg.

    Linus Torvalds
     

20 Mar, 2012

1 commit


07 Mar, 2012

1 commit

  • IO scheduling and cgroup are tied to the issuing task via io_context
    and cgroup of %current. Unfortunately, there are cases where IOs need
    to be routed via a different task which makes scheduling and cgroup
    limit enforcement applied completely incorrectly.

    For example, all bios delayed by blk-throttle end up being issued by a
    delayed work item and get assigned the io_context of the worker task
    which happens to serve the work item and dumped to the default block
    cgroup. This is doubly confusing as bios which aren't delayed end up
    in the correct cgroup and makes using blk-throttle and cfq propio
    together impossible.

    Any code which punts IO issuing to another task is affected which is
    getting more and more common (e.g. btrfs). As both io_context and
    cgroup are firmly tied to task including userland visible APIs to
    manipulate them, it makes a lot of sense to match up tasks to bios.

    This patch implements bio_associate_current() which associates the
    specified bio with %current. The bio will record the associated ioc
    and blkcg at that point and block layer will use the recorded ones
    regardless of which task actually ends up issuing the bio. bio
    release puts the associated ioc and blkcg.

    It grabs and remembers ioc and blkcg instead of the task itself
    because task may already be dead by the time the bio is issued making
    ioc and blkcg inaccessible and those are all block layer cares about.

    elevator_set_req_fn() is updated such that the bio elvdata is being
    allocated for is available to the elevator.

    This doesn't update block cgroup policies yet. Further patches will
    implement the support.

    -v2: #ifdef CONFIG_BLK_CGROUP added around bio->bi_ioc dereference in
    rq_ioc() to fix build breakage.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Cc: Kent Overstreet
    Signed-off-by: Jens Axboe

    Tejun Heo
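
The reason for recording the ioc rather than the task can be shown with a small reference-counting model. This is a userspace sketch: only the ioc is modelled (the blkcg would be handled the same way), the -1 return value is a placeholder, and every toy_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_ioc {
    int refcount;
};

struct toy_task {
    struct toy_ioc *ioc;        /* the task holds one reference */
};

struct toy_bio {
    struct toy_ioc *bi_ioc;     /* NULL until the bio is associated */
};

/* Record the issuing task's ioc in the bio, taking our own reference. */
static int toy_bio_associate(struct toy_bio *bio, struct toy_task *cur)
{
    if (bio->bi_ioc)
        return -1;              /* already associated */
    cur->ioc->refcount++;
    bio->bi_ioc = cur->ioc;
    return 0;
}

/* The task goes away and drops its reference. */
static void toy_task_exit(struct toy_task *t)
{
    t->ioc->refcount--;
    t->ioc = NULL;
}

/* Bio release puts the reference taken at association time. */
static void toy_bio_release(struct toy_bio *bio)
{
    if (bio->bi_ioc) {
        bio->bi_ioc->refcount--;
        bio->bi_ioc = NULL;
    }
}
```

Because the bio holds its own reference, the ioc stays reachable through the bio even after the issuing task has exited.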
     

05 Mar, 2012

1 commit

  • If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
    other BUG variant in a static inline (i.e. not in a #define) then
    that header really should be including <linux/bug.h> and not just
    expecting it to be implicitly present.

    We can make this change risk-free, since if the files using these
    headers didn't have exposure to linux/bug.h already, they would have
    been causing compile failures/warnings.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

13 Jan, 2012

1 commit


16 Nov, 2011

2 commits

  • This is just a cleanup patch to silence a static checker warning.

    The problem is that we cap "nr_iovecs" so it can't be larger than
    "UIO_MAXIOV" but we don't check for negative values. It turns out this is
    prevented at other layers, but logically it doesn't make sense to have
    negative nr_iovecs so making it unsigned is nicer.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dan Carpenter
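
The point about signedness can be seen in a two-function comparison. This is a userspace sketch: the limit of 1024 is the UIO_MAXIOV value quoted elsewhere in this log, and the function names are hypothetical.

```c
#include <assert.h>

#define TOY_UIO_MAXIOV 1024

/* Signed parameter: an upper-bound check alone lets negatives through. */
static int nr_iovecs_ok_signed(int nr_iovecs)
{
    return nr_iovecs <= TOY_UIO_MAXIOV;
}

/* Unsigned parameter: a "negative" value wraps to a huge one and is rejected. */
static int nr_iovecs_ok_unsigned(unsigned int nr_iovecs)
{
    return nr_iovecs <= TOY_UIO_MAXIOV;
}
```

Passing -1 satisfies the signed check but fails the unsigned one, so the single upper-bound comparison covers both ends once the type is unsigned.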
     
  • When CONFIG_BLK_DEV_INTEGRITY is not set, we get these warnings:

    drivers/md/dm.c: In function 'split_bvec':
    drivers/md/dm.c:1061:3: warning: statement with no effect
    drivers/md/dm.c: In function 'clone_bio':
    drivers/md/dm.c:1088:3: warning: statement with no effect

    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Stephen Rothwell
     

24 Oct, 2011

1 commit

  • bio originally had the functionality to set the completion cpu, but
    it is broken.

    Christoph said that "This code is unused, and from the all the
    discussions lately pretty obviously broken. The only thing keeping
    it serves is creating more confusion and possibly more bugs."

    And Jens replied with "We can kill bio_set_completion_cpu(). I'm fine
    with leaving cpu control to the request based drivers, they are the
    only ones that can toggle the setting anyway".

    So this patch tries to remove all the work of controlling the
    completion cpu from a bio.

    Cc: Shaohua Li
    Cc: Christoph Hellwig
    Signed-off-by: Tao Ma
    Signed-off-by: Jens Axboe

    Tao Ma
     

08 Mar, 2011

1 commit


10 Nov, 2010

1 commit

  • REQ_HARDBARRIER is dead now, so remove the leftovers. What's left
    at this point is:

    - various checks inside the block layer.
    - sanity checks in bio based drivers.
    - now unused bio_empty_barrier helper.
    - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's been dead for a
    while, but Xen really needs to sort out its barrier situation.
    - setting of ordered tags in uas - dead code copied from old scsi
    drivers.
    - scsi different retry for barriers - it's dead and should have been
    removed when flushes were converted to FS requests.
    - blktrace handling of barriers - removed. Someone who knows blktrace
    better should add support for REQ_FLUSH and REQ_FUA, though.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

21 Oct, 2010

1 commit


11 Sep, 2010

1 commit

  • Some controllers have a hardware limit on the number of protection
    information scatter-gather list segments they can handle.

    Introduce a max_integrity_segments limit in the block layer and provide
    a new scsi_host_template setting that allows HBA drivers to provide a
    value suitable for the hardware.

    Add support for honoring the integrity segment limit when merging both
    bios and requests.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

08 Aug, 2010

3 commits

  • linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
    flags. This is fragile and caused breakage during BIO_RW_* flag
    rearrangement. The hardcoding is to avoid include dependency hell.

    Create linux/bio_types.h which contains definitions for bio data
    structures and flags and include it from bio.h and fs.h, and make fs.h
    define all READ/WRITE related constants in terms of BIO_RW_* flags.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • SCSI-ml needs a way to mark a request as flush request in
    q->prepare_flush_fn because it needs to identify them later (e.g. in
    q->request_fn or prep_rq_fn).

    queue_flush sets REQ_HARDBARRIER in rq->cmd_flags however the block
    layer also sends normal REQ_TYPE_FS requests with REQ_HARDBARRIER. So
    SCSI-ml can't use REQ_HARDBARRIER to identify flush requests.

    We could change the block layer to clear REQ_HARDBARRIER bit before
    sending non flush requests to the lower layers. However, introducing
    the new flag looks cleaner (surely easier).

    Signed-off-by: FUJITA Tomonori
    Cc: James Bottomley
    Cc: David S. Miller
    Cc: Rusty Russell
    Cc: Alasdair G Kergon
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • Remove the current bio flags and reuse the request flags for the bio, too.
    This makes it easier to trace the type of I/O from the filesystem
    down to the block driver. There were two flags in the bio that were
    missing in the requests: BIO_RW_UNPLUG and BIO_RW_AHEAD. Also I've
    renamed two request flags that had a superfluous RW in them.

    Note that the flags are in bio.h despite having the REQ_ name - as
    blkdev.h includes bio.h that is the only way to go for now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

26 Nov, 2009

1 commit

  • Mtdblock driver doesn't call flush_dcache_page for pages in request. So,
    this causes problems on architectures where the icache doesn't fill from
    the dcache or with dcache aliases. The patch fixes this.

    The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid
    pointless empty cache-thrashing loops on architectures for which
    flush_dcache_page() is a no-op. Every architecture was provided with this
    symbol; pages are flushed on architectures where
    ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is equal to 1, and nothing is done
    otherwise.

    See "fix mtd_blkdevs problem with caches on some architectures" discussion
    on LKML for more information.

    Signed-off-by: Ilya Loginov
    Cc: Ingo Molnar
    Cc: David Woodhouse
    Cc: Peter Horton
    Cc: "Ed L. Cashin"
    Signed-off-by: Jens Axboe

    Ilya Loginov
     

02 Nov, 2009

1 commit


11 Sep, 2009

2 commits