05 Feb, 2016

2 commits

  • James Bottomley
     
  • When a storage device rejects a WRITE SAME command we will disable write
    same functionality for the device and return -EREMOTEIO to the block
    layer. -EREMOTEIO will in turn prevent DM from retrying the I/O and/or
    failing the path.

    Yiwen Jiang discovered a small race where WRITE SAME requests issued
    simultaneously would cause -EIO to be returned. This happened because
    any requests being prepared after WRITE SAME had been disabled for the
    device caused us to return BLKPREP_KILL. The latter caused the block
    layer to return -EIO upon completion.

    To overcome this we introduce BLKPREP_INVALID which indicates that this
    is an invalid request for the device. blk_peek_request() is modified to
    return -EREMOTEIO in that case.
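
    The mapping described above can be sketched in plain C. This is a
    minimal illustration, not the kernel code: the enum and the function
    prep_result_to_errno() are hypothetical stand-ins for the block layer's
    BLKPREP_* values and for the logic in blk_peek_request().

```c
#include <errno.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121 /* Linux value, defined here only so the sketch builds elsewhere */
#endif

/* Hypothetical stand-ins for the block layer's prep results. */
enum prep_result {
    PREP_OK,
    PREP_KILL,    /* request failed while being prepared */
    PREP_INVALID  /* request is not valid for this device */
};

/* Map a prep result to the errno the request would complete with. */
int prep_result_to_errno(enum prep_result result)
{
    switch (result) {
    case PREP_OK:
        return 0;          /* nothing to fail */
    case PREP_INVALID:
        return -EREMOTEIO; /* lets DM skip the retry and keep the path */
    case PREP_KILL:
    default:
        return -EIO;       /* generic I/O error */
    }
}
```

    With a distinct result for invalid requests, a WRITE SAME that races
    with the disabling of the feature completes with the same error as the
    request that triggered the disabling.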

    Reported-by: Yiwen Jiang
    Suggested-by: Mike Snitzer
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Ewan Milne
    Reviewed-by: Yiwen Jiang
    Signed-off-by: Martin K. Petersen

    Martin K. Petersen
     

27 Jan, 2016

2 commits

  • James Bottomley
     
  • Runtime suspend during driver probe and removal can cause problems.
    The driver's runtime_suspend or runtime_resume callbacks may be invoked
    before the driver has finished binding to the device or after the
    driver has unbound from the device.

    This problem shows up with the sd and sr drivers, and can cause disk
    or CD/DVD drives to become unusable as a result. The fix is simple.
    The drivers store a pointer to the scsi_disk or scsi_cd structure as
    their private device data when probing is finished, so we simply have
    to be sure to clear the private data during removal and test it during
    runtime suspend/resume.
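
    The pattern can be sketched as follows. The types and functions here
    (fake_device, fake_disk, fake_runtime_suspend, fake_remove) are
    hypothetical and only model the idea of testing the private data
    pointer; they are not the sd or sr code.

```c
#include <stddef.h>

/* Hypothetical per-disk structure, standing in for scsi_disk or scsi_cd. */
struct fake_disk {
    int suspended;
};

/* Hypothetical device with a private data pointer set at the end of probe. */
struct fake_device {
    void *drvdata;
};

/* Runtime suspend: do nothing if the driver is not (or no longer) bound. */
int fake_runtime_suspend(struct fake_device *dev)
{
    struct fake_disk *disk = dev->drvdata;

    if (disk == NULL)
        return 0; /* probe not finished or remove already ran */

    disk->suspended = 1;
    return 0;
}

/* Removal: clear the private data so later callbacks see NULL. */
void fake_remove(struct fake_device *dev)
{
    dev->drvdata = NULL;
}
```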

    This fixes .

    Signed-off-by: Alan Stern
    Reported-by: Paul Menzel
    Reported-by: Erich Schubert
    Reported-by: Alexandre Rossi
    Tested-by: Paul Menzel
    Tested-by: Erich Schubert
    CC:
    Signed-off-by: James Bottomley

    Alan Stern
     

21 Jan, 2016

1 commit

  • Commit ca369d51b3e1 ("block/sd: Fix device-imposed transfer length
    limits") accidentally switched optimal I/O size reporting from bytes to
    block layer sectors.
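
    To show the size of the unit mismatch, here is a small sketch. The
    helper names are hypothetical, and a block layer sector is assumed to
    be 512 bytes.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u /* assumed size of a block layer sector */

/* Convert a length in logical blocks to bytes. */
uint64_t logical_to_bytes(uint64_t blocks, uint32_t logical_block_size)
{
    return blocks * logical_block_size;
}

/* Convert a length in bytes to block layer sectors. */
uint64_t bytes_to_sectors(uint64_t bytes)
{
    return bytes / SECTOR_SIZE;
}
```

    For an optimal transfer length of 16 blocks on a 4096-byte-block
    device, the value in bytes is 65536 while the value in sectors is 128,
    so reporting sectors where bytes are expected understates the optimal
    I/O size by a factor of 512.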

    Signed-off-by: Martin K. Petersen
    Reported-by: Christian Borntraeger
    Tested-by: Christian Borntraeger
    Fixes: ca369d51b3e1649be4a72addd6d6a168cfb3f537
    Cc: stable@vger.kernel.org # 4.4+
    Reviewed-by: James E.J. Bottomley
    Reviewed-by: Ewan D. Milne
    Reviewed-by: Matthew R. Ochs

    Martin K. Petersen
     

22 Dec, 2015

1 commit

  • Eryu Guan reported that loading scsi_debug would fail. This turned out
    to be caused by scsi_debug reporting an optimal I/O size of 32KB which
    is smaller than the 64KB page size on the PowerPC system in question.

    Add a check to ensure that we only use the device-reported OPTIMAL
    TRANSFER LENGTH if it is bigger than or equal to the page cache size.
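
    A minimal sketch of such a check, with a hypothetical function name
    and both sizes expressed in bytes:

```c
#include <stdint.h>

/*
 * Return 1 if the device-reported optimal transfer size should be used,
 * 0 if the caller should fall back to its default. A value of 0 is
 * treated as "not reported".
 */
int opt_xfer_usable(uint64_t opt_xfer_bytes, uint64_t page_size)
{
    return opt_xfer_bytes != 0 && opt_xfer_bytes >= page_size;
}
```

    With the values from the report, a 32KB optimal size is rejected on a
    system with 64KB pages, while a 64KB optimal size would be accepted.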

    Reported-by: Eryu Guan
    Reported-by: Ming Lei
    Reviewed-by: Douglas Gilbert
    Reviewed-by: Ewan Milne
    Signed-off-by: Martin K. Petersen

    Martin K. Petersen
     

04 Dec, 2015

1 commit


26 Nov, 2015

2 commits

  • Commit 4f258a46346c ("sd: Fix maximum I/O size for BLOCK_PC requests")
    had the unfortunate side-effect of removing an implicit clamp to
    BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer
    code. This caused problems for some SMR drives.

    Debugging this issue revealed a few problems with the existing
    infrastructure since the block layer didn't know how to deal with
    device-imposed limits, only limits set by the I/O controller.

    - Introduce a new queue limit, max_dev_sectors, which is used by the
    ULD to signal the maximum sectors for a REQ_TYPE_FS request.

    - Ensure that max_dev_sectors is correctly stacked and taken into
    account when overriding max_sectors through sysfs.

    - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer
    values for later processing.

    - In sd_revalidate() set the queue's max_dev_sectors based on the
    MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value
    is not reported, fall back to a cap based on the CDB TRANSFER LENGTH
    field size.

    - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits
    VPD--if reported and sane--to signal the preferred device transfer
    size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS.

    - blk_limits_max_hw_sectors() is no longer used and can be removed.
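
    The selection logic in the list above can be sketched roughly as
    follows. This is an illustration under stated assumptions, not the
    sd_revalidate() code: all values are taken to be in the same unit
    (sectors), a reported value of 0 means "not reported", "sane" is
    reduced to "nonzero and not above the device maximum", and the final
    clamp of the default is a choice made for this sketch. The struct and
    function names are hypothetical.

```c
#include <stdint.h>

struct xfer_limits {
    uint32_t max_dev_sectors;   /* hard device limit for FS requests */
    uint32_t preferred_sectors; /* preferred size for FS requests */
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/*
 * max_xfer: MAXIMUM TRANSFER LENGTH from the Block Limits VPD (0 if absent)
 * opt_xfer: OPTIMAL TRANSFER LENGTH from the Block Limits VPD (0 if absent)
 * cdb_cap:  cap implied by the size of the CDB TRANSFER LENGTH field
 * def_max:  default maximum, standing in for BLK_DEF_MAX_SECTORS
 */
struct xfer_limits pick_xfer_limits(uint32_t max_xfer, uint32_t opt_xfer,
                                    uint32_t cdb_cap, uint32_t def_max)
{
    struct xfer_limits l;

    /* Device maximum: the VPD value if reported, else the CDB cap. */
    l.max_dev_sectors = max_xfer ? min_u32(max_xfer, cdb_cap) : cdb_cap;

    /* Preferred size: the VPD optimum if usable, else the default. */
    if (opt_xfer != 0 && opt_xfer <= l.max_dev_sectors)
        l.preferred_sectors = opt_xfer;
    else
        l.preferred_sectors = min_u32(def_max, l.max_dev_sectors);

    return l;
}
```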

    Signed-off-by: Martin K. Petersen
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581
    Reviewed-by: Christoph Hellwig
    Tested-by: sweeneygj@gmx.com
    Tested-by: Arzeets
    Tested-by: David Eisner
    Tested-by: Mario Kicherer
    Signed-off-by: Martin K. Petersen

    Martin K. Petersen
     
  • A device may report an OPTIMAL UNMAP GRANULARITY and UNMAP GRANULARITY
    ALIGNMENT in the Block Limits VPD. These parameters describe the
    device's internal provisioning allocation units. By default the block
    layer will round and align any discard requests based on these limits.

    If a device reports LBPRZ=1 to guarantee zeroes after discard, however,
    it is imperative that the block layer does not leave out any parts of
    the requested block range. Otherwise the device can not do the required
    zeroing of any partial allocation units and this can lead to data
    corruption.

    Since the dm thinp personality relies on the block layer's current
    behavior and is unable to deal with partial discard blocks we work
    around the problem by setting the granularity to match the logical block
    size when LBPRZ is enabled.
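
    A sketch of the workaround, using a hypothetical helper that returns
    the discard granularity in bytes. Treating an unreported granularity
    (0) as one logical block is an assumption of this example.

```c
#include <stdint.h>

/*
 * lbprz:                    nonzero if the device guarantees zeroes after discard
 * unmap_granularity_blocks: OPTIMAL UNMAP GRANULARITY in logical blocks (0 if absent)
 * logical_block_size:       logical block size in bytes
 */
uint32_t discard_granularity_bytes(int lbprz,
                                   uint32_t unmap_granularity_blocks,
                                   uint32_t logical_block_size)
{
    /* With LBPRZ=1 no part of the range may be trimmed away by rounding. */
    if (lbprz)
        return logical_block_size;

    if (unmap_granularity_blocks == 0)
        return logical_block_size;

    return unmap_granularity_blocks * logical_block_size;
}
```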

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen

    Martin K. Petersen
     

14 Nov, 2015

1 commit

  • Pull final round of SCSI updates from James Bottomley:
    "Sorry for the delay in this patch which was mostly caused by getting
    the merger of the mpt2/mpt3sas driver, which was seen as an essential
    item of maintenance work to do before the drivers diverge too much.
    Unfortunately, this caused a compile failure (detected by linux-next),
    which then had to be fixed up and incubated.

    In addition to the mpt2/3sas rework, there are updates from pm80xx,
    lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus
    an assortment of changes including some year 2038 issues, a fix for a
    remove before detach issue in some drivers and a couple of other minor
    issues"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits)
    mpt3sas: fix inline markers on non inline function declarations
    sd: Clear PS bit before Mode Select.
    ibmvscsi: set max_lun to 32
    ibmvscsi: display default value for max_id, max_lun and max_channel.
    mptfusion: don't allow negative bytes in kbuf_alloc_2_sgl()
    scsi: pmcraid: replace struct timeval with ktime_get_real_seconds()
    mvumi: 64bit value for seconds_since1970
    be2iscsi: Fix bogus WARN_ON length check
    scsi_scan: don't dump trace when scsi_prep_async_scan() is called twice
    mpt3sas: Bump mpt3sas driver version to 09.102.00.00
    mpt3sas: Single driver module which supports both SAS 2.0 & SAS 3.0 HBAs
    mpt2sas, mpt3sas: Update the driver versions
    mpt3sas: setpci reset kernel oops fix
    mpt3sas: Added OEM Gen2 PnP ID branding names
    mpt3sas: Refcount fw_events and fix unsafe list usage
    mpt3sas: Refcount sas_device objects and fix unsafe list usage
    mpt3sas: sysfs attribute to report Backup Rail Monitor Status
    mpt3sas: Ported WarpDrive product SSS6200 support
    mpt3sas: fix for driver fails EEH, recovery from injected pci bus error
    mpt3sas: Manage MSI-X vectors according to HBA device type
    ...

    Linus Torvalds
     

12 Nov, 2015

1 commit

  • According to SPC-4, in a Mode Select, the PS bit in Mode Pages is
    reserved and must be set to 0 by the driver. In the sd implementation,
    function cache_type_store does a Mode Sense, which might set the PS bit
    on the read buffer, followed by a Mode Select, which receives the same
    buffer, without explicitly clearing the PS bit. So, in cases where
    target supports saving the Mode Page to a non-volatile location, we end
    up doing a Mode Select with the PS bit set, which could cause an illegal
    request error if the target is checking this.

    This was observed on a new firmware change, which was subsequently
    reverted, but this changes sd.c to be more compliant with SPC-4.

    This patch clears the PS bit in the buffer returned by Mode Sense,
    right before it is used in the Mode Select command.
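
    As an illustration, the PS bit is the most significant bit of the
    first byte of a mode page, so clearing it is a single mask operation.
    The function name is hypothetical, and the pointer is assumed to point
    at the start of the mode page itself (after the mode parameter header
    and any block descriptors).

```c
/* Clear the PS (parameters saveable) bit, bit 7 of mode page byte 0. */
void clear_ps_bit(unsigned char *mode_page)
{
    mode_page[0] &= 0x7f;
}
```

    For a caching mode page read back with PS set, the first byte changes
    from 0x88 to 0x08 and the rest of the page is left untouched.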

    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Martin K. Petersen

    Gabriel Krisman Bertazi
     

05 Nov, 2015

1 commit

  • Pull block reservation support from Jens Axboe:
    "This adds support for persistent reservations, both at the core level,
    as well as for sd and NVMe"

    [ Background from the docs: "Persistent Reservations allow restricting
    access to block devices to specific initiators in a shared storage
    setup. All implementations are expected to ensure the reservations
    survive a power loss and cover all connections in a multi path
    environment" ]

    * 'for-4.4/reservations' of git://git.kernel.dk/linux-block:
    NVMe: Precedence error in nvme_pr_clear()
    nvme: add missing endianess annotations in nvme_pr_command
    NVMe: Add persistent reservation ops
    sd: implement the Persistent Reservation API
    block: add an API for Persistent Reservations
    block: cleanup blkdev_ioctl

    Linus Torvalds
     

22 Oct, 2015

2 commits


03 Sep, 2015

1 commit

  • Pull core block updates from Jens Axboe:
    "This first core part of the block IO changes contains:

    - Cleanup of the bio IO error signaling from Christoph. We used to
    rely on the uptodate bit and passing around of an error, now we
    store the error in the bio itself.

    - Improvement of the above from myself, by shrinking the bio size
    down again to fit in two cachelines on x86-64.

    - Revert of the max_hw_sectors cap removal from a revision again,
    from Jeff Moyer. This caused performance regressions in various
    tests. Reinstate the limit, bump it to a more reasonable size
    instead.

    - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
    Most devices have huge trim limits, which can cause nasty latencies
    when deleting files. Enable the admin to configure the size down.
    We will look into having a more sane default instead of UINT_MAX
    sectors.

    - Improvement of the SG gaps logic from Keith Busch.

    - Enable the block core to handle arbitrarily sized bios, which
    enables a nice simplification of bio_add_page() (which is an IO hot
    path). From Kent.

    - Improvements to the partition io stats accounting, making it
    faster. From Ming Lei.

    - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
    file in blk-mq, as well as a fix for a blk-mq timeout race
    condition.

    - Ming Lin has been carrying Kent's above-mentioned patches forward
    for a while, and testing them. Ming also did a few fixes around
    that.

    - Sasha Levin found and fixed a use-after-free problem introduced by
    the bio->bi_error changes from Christoph.

    - Small blk cgroup cleanup from Viresh Kumar"

    * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
    blk: Fix bio_io_vec index when checking bvec gaps
    block: Replace SG_GAPS with new queue limits mask
    block: bump BLK_DEF_MAX_SECTORS to 2560
    Revert "block: remove artifical max_hw_sectors cap"
    blk-mq: fix race between timeout and freeing request
    blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    Documentation: update notes in biovecs about arbitrarily sized bios
    block: remove bio_get_nr_vecs()
    fs: use helper bio_add_page() instead of open coding on bi_io_vec
    block: kill merge_bvec_fn() completely
    md/raid5: get rid of bio_fits_rdev()
    md/raid5: split bio for chunk_aligned_read
    block: remove split code in blkdev_issue_{discard,write_same}
    btrfs: remove bio splitting and merge_bvec_fn() calls
    bcache: remove driver private bio splitting code
    block: simplify bio_add_page()
    block: make generic_make_request handle arbitrarily sized bios
    blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
    block: don't access bio->bi_error after bio_put()
    block: shrink struct bio down to 2 cache lines again
    ...

    Linus Torvalds
     

13 Aug, 2015

1 commit

  • Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
    size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
    LIMITS VPD. This had the unfortunate effect of also limiting the maximum
    size of non-filesystem requests sent to the device through sg/bsg.

    Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
    limit directly.

    Also update the comment in blk_limits_max_hw_sectors() to clarify that
    max_hw_sectors defines the limit for the I/O controller only.

    Signed-off-by: Martin K. Petersen
    Reported-by: Brian King
    Tested-by: Brian King
    Cc: stable@vger.kernel.org # 3.17+
    Signed-off-by: James Bottomley

    Martin K. Petersen
     

17 Jul, 2015

1 commit


25 May, 2015

1 commit


19 May, 2015

1 commit


17 Apr, 2015

1 commit

  • The new integrity code did not correctly unregister the profile for SD
    disks. Call blk_integrity_unregister() when we release a disk.

    Signed-off-by: Martin K. Petersen
    Reported-by: Sagi Grimberg
    Tested-by: Sagi Grimberg
    Cc: stable@vger.kernel.org # v3.17+
    Signed-off-by: James Bottomley

    Martin K. Petersen
     

11 Apr, 2015

1 commit

  • The current string_get_size() overflows when the device size goes over
    2^64 bytes because the string helper routine computes the suffix from
    the size in bytes. However, the entirety of Linux thinks in terms of
    blocks, not bytes, so this will artificially induce an overflow on very
    large devices. Fix this by making the function string_get_size() take
    blocks and the block size instead of bytes. This should allow us to
    keep working until the current SCSI standard overflows.

    Also fix virtio_blk and mmc (both of which were also artificially
    multiplying by the block size to pass a byte size to string_get_size()).

    The mathematics of this is pretty simple: we're taking a product of
    size in blocks (S) and block size (B) and trying to re-express this in
    exponential form: S*B = R*N^E (where N, the base, is either 1000 or
    1024) and R < N. Mathematically, S = RS*N^ES and B = RB*N^EB, so if RS*RB
    < N it's easy to see that S*B = RS*RB*N^(ES+EB). However, if RS*RB >= N,
    we can see that this can be re-expressed as RS*RB = R*N (where R =
    RS*RB/N < N) so the whole expression becomes R*N^(ES+EB+1).
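
    The decomposition can be tried out with a small sketch. This is not
    the string_get_size() implementation: the names are hypothetical, and
    remainders are simply truncated, whereas a real helper would have to
    carry them to print fractional digits.

```c
#include <stdint.h>

/* A value expressed as r * N^e with r < N. */
struct scaled {
    uint64_t r;
    unsigned int e;
};

/* Reduce v to mantissa/exponent form in base n, truncating remainders. */
static struct scaled reduce(uint64_t v, uint64_t n)
{
    struct scaled s = { v, 0 };

    while (s.r >= n) {
        s.r /= n;
        s.e++;
    }
    return s;
}

/* Express blocks * block_size as R * N^E without forming the full product. */
struct scaled scaled_product(uint64_t blocks, uint64_t block_size, uint64_t n)
{
    struct scaled s = reduce(blocks, n);
    struct scaled b = reduce(block_size, n);
    /* Both mantissas are below n, so their product is below n * n. */
    struct scaled p = { s.r * b.r, s.e + b.e };

    /* If the product of the mantissas reached n, fold one factor of n
     * into the exponent. */
    while (p.r >= n) {
        p.r /= n;
        p.e++;
    }
    return p;
}
```

    For example, 2^62 blocks of 4096 bytes in base 1024 gives RS = 4,
    ES = 6, RB = 4, EB = 1, so the result is 16 * 1024^7, even though the
    byte count 2^74 does not fit in 64 bits.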

    [jejb: fix incorrect 32 bit do_div spotted by kbuild test robot ]
    Acked-by: Ulf Hansson
    Reviewed-by: Andrew Morton
    Signed-off-by: James Bottomley

    James Bottomley
     

19 Mar, 2015

1 commit

  • The device model already takes care of races between ->remove and
    ->shutdown vs its other methods, and we now take care about locking
    them out for ->rescan as well.

    This is a partial revert of commit 39b7f1 ("[SCSI] sd: Fix refcounting").

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Christoph Hellwig
     

12 Feb, 2015

1 commit

  • Pull first round of SCSI updates from James Bottomley:
    "This is the usual grab bag of driver updates (hpsa, storvsc, mpt2sas,
    megaraid_sas, ses) plus an assortment of minor updates.

    There's also an update to ufs which adds new phy drivers and finally a
    new logging infrastructure for SCSI"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (114 commits)
    scsi_logging: return void for dev_printk() functions
    scsi: print single-character strings with seq_putc
    scsi: merge consecutive seq_puts calls
    scsi: replace seq_printf with seq_puts
    aha152x: replace seq_printf with seq_puts
    advansys: replace seq_printf with seq_puts
    scsi: remove SPRINTF macro
    sg: remove an unused variable
    hpsa: Use local workqueues instead of system workqueues
    hpsa: add in P840ar controller model name
    hpsa: add in gen9 controller model names
    hpsa: detect and report failures changing controller transport modes
    hpsa: shorten the wait for the CISS doorbell mode change ack
    hpsa: refactor duplicated scan completion code into a new routine
    hpsa: move SG descriptor set-up out of hpsa_scatter_gather()
    hpsa: do not use function pointers in fast path command submission
    hpsa: print CDBs instead of kernel virtual addresses for uncommon errors
    hpsa: do not use a void pointer for scsi_cmd field of struct CommandList
    hpsa: return failed from device reset/abort handlers
    hpsa: check for ctlr lockup after command allocation in main io path
    ...

    Linus Torvalds
     

02 Feb, 2015

1 commit

  • The following patch fixes an issue observed with 4k sector disks
    where the max_hw_sectors attribute was getting set too large in
    sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units
    of SCSI logical blocks and queue_max_hw_sectors is in units of
    512 byte blocks, on a 4k sector disk, every time we went through
    sd_revalidate_disk, we were taking the current value of
    queue_max_hw_sectors and increasing it by a factor of 8. Fix
    this by only shifting sdkp->max_xfer_blocks.
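
    The difference between the two orderings can be sketched like this.
    The function names are hypothetical, queue_max_hw stands for the
    current queue limit in 512-byte sectors, max_xfer_blocks for the
    device limit in logical blocks (0 meaning not reported), and the
    multiplication by block_size / 512 plays the role of the shift.

```c
/* Minimum of two values, ignoring a value of zero ("not set"). */
static unsigned int min_not_zero(unsigned int a, unsigned int b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* Old ordering: the queue limit is scaled along with the device limit. */
unsigned int revalidate_buggy(unsigned int queue_max_hw,
                              unsigned int max_xfer_blocks,
                              unsigned int block_size)
{
    unsigned int scale = block_size / 512;

    return min_not_zero(queue_max_hw, max_xfer_blocks) * scale;
}

/* Fixed ordering: only the device limit is converted to sectors. */
unsigned int revalidate_fixed(unsigned int queue_max_hw,
                              unsigned int max_xfer_blocks,
                              unsigned int block_size)
{
    unsigned int scale = block_size / 512;

    return min_not_zero(queue_max_hw, max_xfer_blocks * scale);
}
```

    Starting from a queue limit of 1024 sectors on a 4k disk that reports
    no device limit, the old ordering yields 8192 on the first pass and
    65536 on the second, while the fixed ordering stays at 1024.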

    Cc: stable@vger.kernel.org
    Signed-off-by: Brian King
    Reviewed-by: Paolo Bonzini
    Reviewed-by: Martin K. Petersen
    Signed-off-by: Christoph Hellwig

    Brian King
     

09 Jan, 2015

1 commit


30 Dec, 2014

1 commit

  • 7985090aa020 changed the discard heuristics to give preference to the
    WRITE SAME commands that (unlike UNMAP) guarantee deterministic results.

    Ming Lei discovered that QEMU SCSI's WRITE SAME implementation
    internally relied on limits that were only communicated for the UNMAP
    case. And therefore discard commands backed by WRITE SAME would fail.

    Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only
    prefer the WRITE SAME variants if the device has the LBPRZ flag set.
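
    The tweaked heuristic can be sketched as a simple decision function.
    The enum, the parameter names and the final fallback to UNMAP are
    assumptions of this sketch, loosely modelled on the LBPU, LBPWS,
    LBPWS10 and LBPRZ bits reported by the device; other conditions the
    driver may check are left out.

```c
enum discard_mode {
    DISCARD_DISABLED,
    DISCARD_UNMAP,
    DISCARD_WS16, /* WRITE SAME(16) with the UNMAP bit */
    DISCARD_WS10  /* WRITE SAME(10) with the UNMAP bit */
};

enum discard_mode pick_discard_mode(int lbpu, int lbpws, int lbpws10, int lbprz)
{
    /* Without LBPRZ there is no zeroing guarantee to preserve: use UNMAP. */
    if (lbpu && !lbprz)
        return DISCARD_UNMAP;

    /* Otherwise prefer the WRITE SAME variants when supported. */
    if (lbpws)
        return DISCARD_WS16;
    if (lbpws10)
        return DISCARD_WS10;

    /* Last resort: UNMAP, if that is all the device offers. */
    if (lbpu)
        return DISCARD_UNMAP;

    return DISCARD_DISABLED;
}
```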

    Reported-by: Ming Lei
    Tested-by: Ming Lei
    Signed-off-by: Martin K. Petersen
    Acked-by: Paolo Bonzini
    Signed-off-by: Christoph Hellwig

    Martin K. Petersen
     

25 Nov, 2014

3 commits

  • SPC-3 defines SERVICE ACTION IN(12) and SERVICE ACTION IN(16).
    So rename SERVICE_ACTION_IN to SERVICE_ACTION_IN_16 to be
    consistent with SPC and to allow for better distinction.

    Signed-off-by: Hannes Reinecke
    Tested-by: Robert Elliott
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     
  • The driver core driver structure has grown an owner field and now
    requires it to be set for all modular drivers. Set it up for
    all scsi_driver instances and get rid of the now superfluous
    scsi_driver owner field.

    Signed-off-by: Christoph Hellwig
    Reported-by: Shane M Seymour
    Reviewed-by: Ewan D. Milne

    Christoph Hellwig
     
  • There is no reason for ULDs to pass in a flag on how to allocate the S/G
    lists. While we don't need GFP_ATOMIC for the blk-mq case because we
    don't hold locks, that decision can be made way down the chain without
    having to pass a pointless gfp_mask argument.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     

12 Nov, 2014

6 commits

  • The T10 SBC UNMAP command does not provide any hard guarantees that
    blocks will return zeroes on a subsequent READ. This is due to the fact
    that the device server is free to silently ignore all or parts of the
    request.

    The only way to ensure that a block consistently returns zeroes after
    being unmapped is to use WRITE SAME with the UNMAP bit set. Should the
    device be unable to unmap one or more blocks described by the command it
    is required to manually write zeroes to them.

    Until now we have preferred UNMAP over the WRITE SAME variants to
    accommodate thinly provisioned devices that predated the final SBC-3
    spec. This patch changes the heuristic so that we favor WRITE SAME(16)
    or (10) over UNMAP if these commands are marked as supported in the
    Logical Block Provisioning VPD page.

    The patch also disables discard_zeroes_data for devices operating in
    UNMAP mode.

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Paolo Bonzini
    Signed-off-by: Christoph Hellwig

    Martin K. Petersen
     
  • No need to verify the passthrough ioctls, the real handler will
    take care of that. Also make sure not to block for resets on
    O_NONBLOCK fds.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     
  • The calling conventions for this function are bad as it could return
    -ENODEV both for a device not currently online and an unrecognized ioctl.

    Add a new scsi_ioctl_block_when_processing_errors function that wraps
    scsi_block_when_processing_errors with a special case for the
    SG_SCSI_RESET ioctl command, and handle the SG_SCSI_RESET case itself
    in scsi_ioctl. All callers of scsi_ioctl now must call the above helper
    to check for the EH state, so that the ioctl handler itself doesn't
    have to.

    Reported-by: Robert Elliott
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     
  • Open-code scsi_print_result in sd.c, and cleanup logging to
    not print duplicate information.
    Also remove the call to scsi_show_result() in ufshcd.c
    to be consistent with other callers of scsi_execute().

    With that we can remove scsi_show_result in constants.c

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     
  • We should be using the standard dev_printk() variants for
    sense code printing.

    [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen]
    [hch: folded bracing fix from Dan Carpenter]
    Signed-off-by: Hannes Reinecke
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     
  • sd_done() was calling scsi_print_sense() for a sense code
    of 'NO_SENSE'.

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     

19 Oct, 2014

2 commits

  • Pull block layer driver update from Jens Axboe:
    "This is the block driver pull request for 3.18. Not a lot in there
    this round, and nothing earth shattering.

    - A round of drbd fixes from the linbit team, and an improvement in
    asender performance.

    - Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and
    hd from Michael Opdenacker.

    - Disable entropy collection from flash devices by default, from Mike
    Snitzer.

    - A small collection of xen blkfront/back fixes from Roger Pau Monné
    and Vitaly Kuznetsov"

    * 'for-3.18/drivers' of git://git.kernel.dk/linux-block:
    block: disable entropy contributions for nonrot devices
    xen, blkfront: factor out flush-related checks from do_blkif_request()
    xen-blkback: fix leak on grant map error path
    xen/blkback: unmap all persistent grants when frontend gets disconnected
    rsxx: Remove deprecated IRQF_DISABLED
    block: hd: remove deprecated IRQF_DISABLED
    drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks
    drbd: compute the end before rb_insert_augmented()
    drbd: Add missing newline in resync progress display in /proc/drbd
    drbd: reduce lock contention in drbd_worker
    drbd: Improve asender performance
    drbd: Get rid of the WORK_PENDING macro
    drbd: Get rid of the __no_warn and __cond_lock macros
    drbd: Avoid inconsistent locking warning
    drbd: Remove superfluous newline from "resync_extents" debugfs entry.
    drbd: Use consistent names for all the bi_end_io callbacks
    drbd: Use better variable names

    Linus Torvalds
     
  • Pull core block layer changes from Jens Axboe:
    "This is the core block IO pull request for 3.18. Apart from the new
    and improved flush machinery for blk-mq, this is all mostly bug fixes
    and cleanups.

    - blk-mq timeout updates and fixes from Christoph.

    - Removal of REQ_END, also from Christoph. We pass it through the
    ->queue_rq() hook for blk-mq instead, freeing up one of the request
    bits. The space was overly tight on 32-bit, so Martin also killed
    REQ_KERNEL since it's no longer used.

    - blk integrity updates and fixes from Martin and Gu Zheng.

    - Update to the flush machinery for blk-mq from Ming Lei. Now we
    have a per hardware context flush request, which both cleans up the
    code and should scale better for flush intensive workloads on blk-mq.

    - Improve the error printing, from Rob Elliott.

    - Backing device improvements and cleanups from Tejun.

    - Fixup of a misplaced rq_complete() tracepoint from Hannes.

    - Make blk_get_request() return error pointers, fixing up issues
    where we NULL deref when a device goes bad or missing. From Joe
    Lawrence.

    - Prep work for drastically reducing the memory consumption of dm
    devices from Junichi Nomura. This allows creating clone bio sets
    without preallocating a lot of memory.

    - Fix a blk-mq hang on certain combinations of queue depths and
    hardware queues from me.

    - Limit memory consumption for blk-mq devices for crash dump
    scenarios and drivers that use crazy high depths (certain SCSI
    shared tag setups). We now just use a single queue and limited
    depth for that"

    * 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
    block: Remove REQ_KERNEL
    blk-mq: allocate cpumask on the home node
    bio-integrity: remove the needless fail handle of bip_slab creating
    block: include func name in __get_request prints
    block: make blk_update_request print prefix match ratelimited prefix
    blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
    block: fix alignment_offset math that assumes io_min is a power-of-2
    blk-mq: Make bt_clear_tag() easier to read
    blk-mq: fix potential hang if rolling wakeup depth is too high
    block: add bioset_create_nobvec()
    block: use bio_clone_fast() in blk_rq_prep_clone()
    block: misplaced rq_complete tracepoint
    sd: Honor block layer integrity handling flags
    block: Replace strnicmp with strncasecmp
    block: Add T10 Protection Information functions
    block: Don't merge requests if integrity flags differ
    block: Integrity checksum flag
    block: Relocate bio integrity flags
    block: Add a disk flag to block integrity profile
    block: Add prefix to block integrity profile flags
    ...

    Linus Torvalds
     

05 Oct, 2014

1 commit

  • Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set
    QUEUE_FLAG_NONROT.

    Historically, all block devices have automatically made entropy
    contributions. But as previously stated in commit e2e1a148 ("block: add
    sysfs knob for turning off disk entropy contributions"):
    - On SSD disks, the completion times aren't as random as they
    are for rotational drives. So it's questionable whether they
    should contribute to the random pool in the first place.
    - Calling add_disk_randomness() has a lot of overhead.

    There are more reliable sources for randomness than non-rotational block
    devices. From a security perspective it is better to err on the side of
    caution than to allow entropy contributions from unreliable "random"
    sources.
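
    In terms of flag manipulation, the change amounts to the following.
    The flag values and the function name are hypothetical stand-ins for
    QUEUE_FLAG_NONROT, QUEUE_FLAG_ADD_RANDOM and the per-driver setup code.

```c
#define FLAG_NONROT     (1u << 0) /* device is non-rotational */
#define FLAG_ADD_RANDOM (1u << 1) /* device contributes to the entropy pool */

/* Mark a queue as non-rotational and stop its entropy contributions. */
unsigned int mark_nonrot(unsigned int queue_flags)
{
    queue_flags |= FLAG_NONROT;
    queue_flags &= ~FLAG_ADD_RANDOM;
    return queue_flags;
}
```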

    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

01 Oct, 2014

1 commit

  • A set of flags introduced in the block layer enable better control over
    how protection information is handled. These flags are useful for both
    error injection and data recovery purposes. Checking can be enabled and
    disabled for controller and disk, and the guard tag format is now a
    per-I/O property.

    Update sd_protect_op to communicate the relevant information to the
    low-level device driver via a set of flags in scsi_cmnd.

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Sagi Grimberg
    Acked-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

16 Sep, 2014

1 commit

  • SCSI well-known logical units generally don't have any SCSI driver
    associated with them, which means no one will call scsi_autopm_put_device()
    on these wlun SCSI devices. This keeps the corresponding SCSI device
    always active (hence the LLD can't be suspended either). The same
    problem can be seen for any other SCSI device representing a normal
    logical unit whose driver is yet to be loaded. This patch fixes
    the above problem with this approach:

    - make the scsi_autopm_put_device call at the end of scsi_sysfs_add_sdev
    to make it balance out the get earlier in the function.
    - let drivers do paired get/put calls in their probe methods.
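
    The balancing can be modelled with a plain usage counter. Everything
    in this sketch is hypothetical (pm_dev, pm_get, pm_put and the two
    functions standing in for scsi_sysfs_add_sdev and a driver's probe
    method); it only shows that each function leaves the count where it
    found it.

```c
/* Hypothetical device with a runtime PM usage count. */
struct pm_dev {
    int usage;
};

void pm_get(struct pm_dev *dev) { dev->usage++; }
void pm_put(struct pm_dev *dev) { dev->usage--; }

/* The device may be suspended only when nobody holds a reference. */
int pm_can_suspend(const struct pm_dev *dev)
{
    return dev->usage == 0;
}

/* Registration: take a reference for the duration, drop it at the end. */
void model_sysfs_add(struct pm_dev *dev)
{
    pm_get(dev);
    /* ... register the device ... */
    pm_put(dev);
}

/* Driver probe: paired get/put around the probing work. */
int model_probe(struct pm_dev *dev)
{
    pm_get(dev);
    /* ... set up the device ... */
    pm_put(dev);
    return 0;
}
```

    With this arrangement a device that never gets a driver, such as a
    well-known logical unit, ends registration with a usage count of zero
    and can be suspended.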

    Signed-off-by: Subhash Jadavani
    Signed-off-by: Dolev Raviv
    Signed-off-by: Christoph Hellwig

    Subhash Jadavani