14 Nov, 2018

2 commits

  • commit 34ffec60b27aa81d04e274e71e4c6ef740f75fc7 upstream.

    Obviously the created writesame bio has to be aligned with logical block
    size, and use bio_allowed_max_sectors() to retrieve this number.

    Cc: stable@vger.kernel.org
    Cc: Mike Snitzer
    Cc: Christoph Hellwig
    Cc: Xiao Ni
    Cc: Mariusz Dabrowski
    Fixes: b49a0871be31a745b2ef ("block: remove split code in blkdev_issue_{discard,write_same}")
    Tested-by: Rui Salvaterra
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • commit 1adfc5e4136f5967d591c399aff95b3b035f16b7 upstream.

    Obviously the created discard bio has to be aligned with logical block size.

    This patch introduces the helper of bio_allowed_max_sectors() for
    this purpose.

    Cc: stable@vger.kernel.org
    Cc: Mike Snitzer
    Cc: Christoph Hellwig
    Cc: Xiao Ni
    Cc: Mariusz Dabrowski
    Fixes: 744889b7cbb56a6 ("block: don't deal with discard limit in blkdev_issue_discard()")
    Fixes: a22c4d7e34402cc ("block: re-add discard_granularity and alignment checks")
    Reported-by: Rui Salvaterra
    Tested-by: Rui Salvaterra
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     

18 Oct, 2018

1 commit

  • blk_queue_split() does respect this limit via bio splitting, so no
    need to do that in blkdev_issue_discard(), then we can align to
    normal bio submit(bio_add_page() & submit_bio()).

    More importantly, this patch fixes one issue introduced in a22c4d7e34402cc
    ("block: re-add discard_granularity and alignment checks"), in which
    zero discard bio may be generated in case of zero alignment.

    Fixes: a22c4d7e34402ccdf3 ("block: re-add discard_granularity and alignment checks")
    Cc: stable@vger.kernel.org
    Cc: Ming Lin
    Cc: Mike Snitzer
    Cc: Christoph Hellwig
    Cc: Xiao Ni
    Tested-by: Mariusz Dabrowski
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

09 Jul, 2018

1 commit

  • If __blkdev_issue_discard is in progress and a device mapper device is
    reloaded with a table that doesn't support discard,
    q->limits.max_discard_sectors is set to zero. This results in infinite
    loop in __blkdev_issue_discard.

    This patch checks if max_discard_sectors is zero and aborts with
    -EOPNOTSUPP.

    Signed-off-by: Mikulas Patocka
    Tested-by: Zdenek Kabelac
    Cc: stable@vger.kernel.org
    Signed-off-by: Jens Axboe

    Mikulas Patocka
     

09 May, 2018

1 commit


19 Jan, 2018

1 commit

  • Similar to blkdev_write_iter(), return -EPERM if the partition is
    read-only. This covers ioctl(), fallocate() and most in-kernel users
    but isn't meant to be exhaustive -- everything else will be caught in
    generic_make_request_checks(), fail with -EIO and can be fixed later.

    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Jens Axboe

    Ilya Dryomov
     

15 Nov, 2017

1 commit

  • Pull core block layer updates from Jens Axboe:
    "This is the main pull request for block storage for 4.15-rc1.

    Nothing out of the ordinary in here, and no API changes or anything
    like that. Just various new features for drivers, core changes, etc.
    In particular, this pull request contains:

    - A patch series from Bart, closing the whole on blk/scsi-mq queue
    quescing.

    - A series from Christoph, building towards hidden gendisks (for
    multipath) and ability to move bio chains around.

    - NVMe
    - Support for native multipath for NVMe (Christoph).
    - Userspace notifications for AENs (Keith).
    - Command side-effects support (Keith).
    - SGL support (Chaitanya Kulkarni)
    - FC fixes and improvements (James Smart)
    - Lots of fixes and tweaks (Various)

    - bcache
    - New maintainer (Michael Lyle)
    - Writeback control improvements (Michael)
    - Various fixes (Coly, Elena, Eric, Liang, et al)

    - lightnvm updates, mostly centered around the pblk interface
    (Javier, Hans, and Rakesh).

    - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

    - Writeback series that fix the much discussed hundreds of millions
    of sync-all units. This goes all the way, as discussed previously
    (me).

    - Fix for missing wakeup on writeback timer adjustments (Yafang
    Shao).

    - Fix laptop mode on blk-mq (me).

    - {mq,name} tupple lookup for IO schedulers, allowing us to have
    alias names. This means you can use 'deadline' on both !mq and on
    mq (where it's called mq-deadline). (me).

    - blktrace race fix, oopsing on sg load (me).

    - blk-mq optimizations (me).

    - Obscure waitqueue race fix for kyber (Omar).

    - NBD fixes (Josef).

    - Disable writeback throttling by default on bfq, like we do on cfq
    (Luca Miccio).

    - Series from Ming that enable us to treat flush requests on blk-mq
    like any other request. This is a really nice cleanup.

    - Series from Ming that improves merging on blk-mq with schedulers,
    getting us closer to flipping the switch on scsi-mq again.

    - BFQ updates (Paolo).

    - blk-mq atomic flags memory ordering fixes (Peter Z).

    - Loop cgroup support (Shaohua).

    - Lots of minor fixes from lots of different folks, both for core and
    driver code"

    * 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
    nvme: fix visibility of "uuid" ns attribute
    blk-mq: fixup some comment typos and lengths
    ide: ide-atapi: fix compile error with defining macro DEBUG
    blk-mq: improve tag waiting setup for non-shared tags
    brd: remove unused brd_mutex
    blk-mq: only run the hardware queue if IO is pending
    block: avoid null pointer dereference on null disk
    fs: guard_bio_eod() needs to consider partitions
    xtensa/simdisk: fix compile error
    nvme: expose subsys attribute to sysfs
    nvme: create 'slaves' and 'holders' entries for hidden controllers
    block: create 'slaves' and 'holders' entries for hidden gendisks
    nvme: also expose the namespace identification sysfs files for mpath nodes
    nvme: implement multipath access to nvme subsystems
    nvme: track shared namespaces
    nvme: introduce a nvme_ns_ids structure
    nvme: track subsystems
    block, nvme: Introduce blk_mq_req_flags_t
    block, scsi: Make SCSI quiesce and resume work reliably
    block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
    ...

    Linus Torvalds
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

26 Oct, 2017

2 commits

  • sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
    permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
    is set. Because REQ_OP_WRITE_ZEROES is implemented in terms of WRITE
    SAME, blkdev_issue_zeroout() may fail with -EREMOTEIO:

    $ fallocate -zn -l 1k /dev/sdg
    fallocate: fallocate failed: Remote I/O error
    $ fallocate -zn -l 1k /dev/sdg # OK
    $ fallocate -zn -l 1k /dev/sdg # OK

    The following calls succeed because sd_done() sets ->no_write_same in
    response to a sense that would become BLK_STS_TARGET/-EREMOTEIO, causing
    __blkdev_issue_zeroout() to fall back to generating ZERO_PAGE bios.

    This means blkdev_issue_zeroout() must cope with WRITE ZEROES failing
    and fall back to manually zeroing, unless BLKDEV_ZERO_NOFALLBACK is
    specified. For BLKDEV_ZERO_NOFALLBACK case, return -EOPNOTSUPP if
    sd_done() has just set ->no_write_same thus indicating lack of offload
    support.

    Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
    Cc: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Jens Axboe

    Ilya Dryomov
     
  • blkdev_issue_zeroout() will use this in !BLKDEV_ZERO_NOFALLBACK case.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Jens Axboe

    Ilya Dryomov
     

11 Sep, 2017

1 commit


24 Aug, 2017

1 commit

  • This way we don't need a block_device structure to submit I/O. The
    block_device has different life time rules from the gendisk and
    request_queue and is usually only available when the block device node
    is open. Other callers need to explicitly create one (e.g. the lightnvm
    passthrough code, or the new nvme multipathing code).

    For the actual I/O path all that we need is the gendisk, which exists
    once per block device. But given that the block layer also does
    partition remapping we additionally need a partition index, which is
    used for said remapping in generic_make_request.

    Note that all the block drivers generally want request_queue or
    sometimes the gendisk, so this removes a layer of indirection all
    over the stack.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

06 Jul, 2017

1 commit

  • The BIO issuing loop in __blkdev_issue_zeroout() is allocating BIOs
    with a maximum number of bvec (pages) equal to

    min(nr_sects, (sector_t)BIO_MAX_PAGES)

    This works since the requested number of bvecs will always be limited
    to the absolute maximum number supported (BIO_MAX_PAGES), but this is
    ineficient as too many bvec entries may be requested due to the
    different units being used in the min() operation (number of sectors vs
    number of pages).
    To fix this, introduce the helper __blkdev_sectors_to_bio_pages() to
    correctly calculate the number of bvecs for zeroout BIOs as the issuing
    loop progresses. The calculation is done using consistent units and
    makes sure that the number of pages return is at least 1 (for cases
    where the number of sectors is less that the number of sectors in
    a page).

    Also remove a trailing space after the bit shift in the internal loop
    min() call.

    Signed-off-by: Damien Le Moal
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Damien Le Moal
     

09 Apr, 2017

6 commits


25 Mar, 2017

1 commit


07 Feb, 2017

1 commit

  • Write Same can return an error asynchronously if it turns out the
    underlying SCSI device does not support Write Same, which makes a
    proper fallback to other methods in __blkdev_issue_zeroout impossible.
    Thus only issue a Write Same from blkdev_issue_zeroout an don't try it
    at all from __blkdev_issue_zeroout as a non-invasive workaround.

    Signed-off-by: Christoph Hellwig
    Reported-by: Junichi Nomura
    Fixes: e73c23ff ("block: add async variant of blkdev_issue_zeroout")
    Tested-by: Junichi Nomura
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

14 Jan, 2017

1 commit

  • Discard can return -EIO asynchronously if the alignment for the request
    isn't suitable for the driver, which makes a proper fallback to other
    methods in __blkdev_issue_zeroout impossible. Thus only issue a sync
    discard from blkdev_issue_zeroout an don't try discard at all from
    __blkdev_issue_zeroout as a non-invasive workaround.

    One more reason why abusing discard for zeroing must die..

    Signed-off-by: Christoph Hellwig
    Reported-by: Eryu Guan
    Fixes: e73c23ff ("block: add async variant of blkdev_issue_zeroout")
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

09 Dec, 2016

1 commit

  • Instead of allocating a single unused biovec for discard requests, send
    them down without any payload. Instead we allow the driver to add a
    "special" payload using a biovec embedded into struct request (unioned
    over other fields never used while in the driver), and overloading
    the number of segments for this case.

    This has a couple of advantages:

    - we don't have to allocate the bio_vec
    - the amount of special casing for discard requests in the block
    layer is significantly reduced
    - using this same scheme for other request types is trivial,
    which will be important for implementing the new WRITE_ZEROES
    op on devices where it actually requires a payload (e.g. SCSI)
    - we can get rid of playing games with the request length, as
    we'll never touch it and completions will work just fine
    - it will allow us to support ranged discard operations in the
    future by merging non-contiguous discard bios into a single
    request
    - last but not least it removes a lot of code

    This patch is the common base for my WIP series for ranges discards and to
    remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES,
    so it would be good to get it in quickly.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

01 Dec, 2016

2 commits

  • This adds a new block layer operation to zero out a range of
    LBAs. This allows to implement zeroing for devices that don't use
    either discard with a predictable zero pattern or WRITE SAME of zeroes.
    The prominent example of that is NVMe with the Write Zeroes command,
    but in the future, this should also help with improving the way
    zeroing discards work. For this operation, suitable entry is exported in
    sysfs which indicate the number of maximum bytes allowed in one
    write zeroes operation by the device.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     
  • Similar to __blkdev_issue_discard this variant allows submitting
    the final bio asynchronously and chaining multiple ranges
    into a single completion.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Chaitanya Kulkarni
     

28 Oct, 2016

1 commit

  • Now that we don't need the common flags to overflow outside the range
    of a 32-bit type we can encode them the same way for both the bio and
    request fields. This in addition allows us to place the operation
    first (and make some room for more ops while we're at it) and to
    stop having to shift around the operation values.

    In addition this allows passing around only one value in the block layer
    instead of two (and eventuall also in the file systems, but we can do
    that later) and thus clean up a lot of code.

    Last but not least this allows decreasing the size of the cmd_flags
    field in struct request to 32-bits. Various functions passing this
    value could also be updated, but I'd like to avoid the churn for now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

12 Oct, 2016

1 commit

  • Make sure that the offset and length arguments that we're using to
    construct WRITE SAME and DISCARD requests are actually aligned to the
    logical block size. Failure to do this causes other errors in other parts
    of the block layer or the SCSI layer because disks don't support partial
    logical block writes.

    Link: http://lkml.kernel.org/r/147518379026.22791.4437508871355153928.stgit@birch.djwong.org
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Cc: Theodore Ts'o
    Cc: Mike Snitzer # tweaked header
    Cc: Brian Foster
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     

27 Jul, 2016

2 commits

  • Pull block driver updates from Jens Axboe:
    "This branch also contains core changes. I've come to the conclusion
    that from 4.9 and forward, I'll be doing just a single branch. We
    often have dependencies between core and drivers, and it's hard to
    always split them up appropriately without pulling core into drivers
    when that happens.

    That said, this contains:

    - separate secure erase type for the core block layer, from
    Christoph.

    - set of discard fixes, from Christoph.

    - bio shrinking fixes from Christoph, as a followup up to the
    op/flags change in the core branch.

    - map and append request fixes from Christoph.

    - NVMeF (NVMe over Fabrics) code from Christoph. This is pretty
    exciting!

    - nvme-loop fixes from Arnd.

    - removal of ->driverfs_dev from Dan, after providing a
    device_add_disk() helper.

    - bcache fixes from Bhaktipriya and Yijing.

    - cdrom subchannel read fix from Vchannaiah.

    - set of lightnvm updates from Wenwei, Matias, Johannes, and Javier.

    - set of drbd updates and fixes from Fabian, Lars, and Philipp.

    - mg_disk error path fix from Bart.

    - user notification for failed device add for loop, from Minfei.

    - NVMe in general:
    + NVMe delay quirk from Guilherme.
    + SR-IOV support and command retry limits from Keith.
    + fix for memory-less NUMA node from Masayoshi.
    + use UINT_MAX for discard sectors, from Minfei.
    + cancel IO fixes from Ming.
    + don't allocate unused major, from Neil.
    + error code fixup from Dan.
    + use constants for PSDT/FUSE from James.
    + variable init fix from Jay.
    + fabrics fixes from Ming, Sagi, and Wei.
    + various fixes"

    * 'for-4.8/drivers' of git://git.kernel.dk/linux-block: (115 commits)
    nvme/pci: Provide SR-IOV support
    nvme: initialize variable before logical OR'ing it
    block: unexport various bio mapping helpers
    scsi/osd: open code blk_make_request
    target: stop using blk_make_request
    block: simplify and export blk_rq_append_bio
    block: ensure bios return from blk_get_request are properly initialized
    virtio_blk: use blk_rq_map_kern
    memstick: don't allow REQ_TYPE_BLOCK_PC requests
    block: shrink bio size again
    block: simplify and cleanup bvec pool handling
    block: get rid of bio_rw and READA
    block: don't ignore -EOPNOTSUPP blkdev_issue_write_same
    block: introduce BLKDEV_DISCARD_ZERO to fix zeroout
    NVMe: don't allocate unused nvme_major
    nvme: avoid crashes when node 0 is memoryless node.
    nvme: Limit command retries
    loop: Make user notify for adding loop device failed
    nvme-loop: fix nvme-loop Kconfig dependencies
    nvmet: fix return value check in nvmet_subsys_alloc()
    ...

    Linus Torvalds
     
  • Pull core block updates from Jens Axboe:

    - the big change is the cleanup from Mike Christie, cleaning up our
    uses of command types and modified flags. This is what will throw
    some merge conflicts

    - regression fix for the above for btrfs, from Vincent

    - following up to the above, better packing of struct request from
    Christoph

    - a 2038 fix for blktrace from Arnd

    - a few trivial/spelling fixes from Bart Van Assche

    - a front merge check fix from Damien, which could cause issues on
    SMR drives

    - Atari partition fix from Gabriel

    - convert cfq to highres timers, since jiffies isn't granular enough
    for some devices these days. From Jan and Jeff

    - CFQ priority boost fix idle classes, from me

    - cleanup series from Ming, improving our bio/bvec iteration

    - a direct issue fix for blk-mq from Omar

    - fix for plug merging not involving the IO scheduler, like we do for
    other types of merges. From Tahsin

    - expose DAX type internally and through sysfs. From Toshi and Yigal

    * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
    block: Fix front merge check
    block: do not merge requests without consulting with io scheduler
    block: Fix spelling in a source code comment
    block: expose QUEUE_FLAG_DAX in sysfs
    block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
    Btrfs: fix comparison in __btrfs_map_block()
    block: atari: Return early for unsupported sector size
    Doc: block: Fix a typo in queue-sysfs.txt
    cfq-iosched: Charge at least 1 jiffie instead of 1 ns
    cfq-iosched: Fix regression in bonnie++ rewrite performance
    cfq-iosched: Convert slice_resid from u64 to s64
    block: Convert fifo_time from ulong to u64
    blktrace: avoid using timespec
    block/blk-cgroup.c: Declare local symbols static
    block/bio-integrity.c: Add #include "blk.h"
    block/partition-generic.c: Remove a set-but-not-used variable
    block: bio: kill BIO_MAX_SIZE
    cfq-iosched: temporarily boost queue priority for idle classes
    block: drbd: avoid to use BIO_MAX_SIZE
    block: bio: remove BIO_MAX_SECTORS
    ...

    Linus Torvalds
     

21 Jul, 2016

2 commits

  • WRITE SAME is a data integrity operation and we can't simply ignore
    errors.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Currently blkdev_issue_zeroout cascades down from discards (if the driver
    guarantees that discards zero data), to WRITE SAME and then to a loop
    writing zeroes. Unfortunately we ignore run-time EOPNOTSUPP errors in the
    block layer blkdev_issue_discard helper to work around DM volumes that
    may have mixed discard support underneath.

    This patch intoroduces a new BLKDEV_DISCARD_ZERO flag to
    blkdev_issue_discard that indicates we are called for zeroing operation.
    This allows both to ignore the EOPNOTSUPP hack and actually consolidating
    the discard_zeroes_data check into the function.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

09 Jun, 2016

1 commit


08 Jun, 2016

4 commits

  • This converts the block issue discard helper and users to use
    the bio_set_op_attrs accessor and only pass in the operation flags
    like REQ_SEQURE.

    Signed-off-by: Mike Christie
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Mike Christie
     
  • This patch converts the simple bi_rw use cases in the block,
    drivers, mm and fs code to set/get the bio operation using
    bio_set_op_attrs/bio_op

    These should be simple one or two liner cases, so I just did them
    in one patch. The next patches handle the more complicated
    cases in a module per patch.

    Signed-off-by: Mike Christie
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Mike Christie
     
  • This has callers of submit_bio/submit_bio_wait set the bio->bi_rw
    instead of passing it in. This makes that use the same as
    generic_make_request and how we set the other bio fields.

    Signed-off-by: Mike Christie

    Fixed up fs/ext4/crypto.c

    Signed-off-by: Jens Axboe

    Mike Christie
     
  • submit_bio_wait() gives the caller an opportunity to examine
    struct bio and so expects the caller to issue the put_bio()

    This fixes a memory leak reported by a few people in 4.7-rc2
    kmemleak report after 9082e87bfbf8 ("block: remove struct bio_batch")

    Signed-off-by: Shaun Tancheff
    Tested-by: Catalin Marinas
    Tested-by: Larry Finger@lwfinger.net
    Tested-by: David Drysdale
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Shaun Tancheff
     

06 May, 2016

1 commit

  • Commit 38f25255330 ("block: add __blkdev_issue_discard") incorrectly
    disallowed the early return of -EOPNOTSUPP if the device doesn't support
    discard (or secure discard). This early return of -EOPNOTSUPP has
    always been part of blkdev_issue_discard() interface so there isn't a
    good reason to break that behaviour -- especially when it can be easily
    reinstated.

    The nuance of allowing early return of -EOPNOTSUPP vs disallowing late
    return of -EOPNOTSUPP is: if the overall device never advertised support
    for discards and one is issued to the device it is beneficial to inform
    the caller that discards are not supported via -EOPNOTSUPP. But if a
    device advertises discard support it means that at least a subset of the
    device does have discard support -- but it could be that discards issued
    to some regions of a stacked device will not be supported. In that case
    the late return of -EOPNOTSUPP must be disallowed.

    Fixes: 38f25255330 ("block: add __blkdev_issue_discard")
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

02 May, 2016

2 commits

  • This is a version of blkdev_issue_discard which doesn't wait for
    the I/O to complete, but instead allows the caller to submit
    the final bio and/or chain it to others.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ming Lin
    Signed-off-by: Sagi Grimberg
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • It can be replaced with a combination of bio_chain and submit_bio_wait.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ming Lin
    Signed-off-by: Sagi Grimberg
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

28 Oct, 2015

1 commit

  • In commit b49a087("block: remove split code in
    blkdev_issue_{discard,write_same}"), discard_granularity and alignment
    checks were removed. Ideally, with bio late splitting, the upper layers
    shouldn't need to depend on device's limits.

    Christoph reported a discard regression on the HGST Ultrastar SN100 NVMe
    device when mkfs.xfs. We have not found the root cause yet.

    This patch re-adds discard_granularity and alignment checks by reverting
    the related changes in commit b49a087. The good thing is now we can
    remove the 2G discard size cap and just use UINT_MAX to avoid bi_size
    overflow.

    Reviewed-by: Christoph Hellwig
    Tested-by: Christoph Hellwig
    Signed-off-by: Ming Lin
    Reviewed-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Ming Lin