23 Jan, 2015

1 commit

  • James reported:
    > After e513cc1 module: Remove stop_machine from module unloading,
    > module_refcount() is returning (unsigned long)-1 when called from within
    > a routine that runs in module_exit. This is confusing the scsi device
    > put code which is coded to detect a module_refcount() of zero for
    > running within a module exit routine and not try to do another
    > module_put. The fix is to restore the original behaviour of
    > module_refcount() and return zero if we're running inside an exit
    > routine.

    The correct fix is to turn try_module_get() into __module_get(), and
    always do the module_put().

    Acked-by: James Bottomley
    Signed-off-by: Rusty Russell

    Rusty Russell
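The difference between the two approaches can be illustrated with a small userspace model. This is only a sketch: `struct fake_module`, `fake_try_get()`, `fake_get()`, `fake_put()` and `balanced_get_put()` are hypothetical stand-ins, not the kernel's module API. It shows that a conditional get refuses a reference once the exit routine has started, while an unconditional get paired with an unconditional put stays balanced without ever inspecting the refcount.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a module; not the kernel's struct module. */
struct fake_module {
	int refcnt;
	bool going;	/* set once the exit routine has started */
};

/* Models try_module_get(): refuses a reference once the module is exiting. */
bool fake_try_get(struct fake_module *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

/* Models __module_get(): the caller already knows the module is pinned. */
void fake_get(struct fake_module *m)
{
	m->refcnt++;
}

/* Models module_put(): drops one reference. */
void fake_put(struct fake_module *m)
{
	m->refcnt--;
}

/* Get/put pair that never looks at the refcount; returns the final count. */
int balanced_get_put(struct fake_module *m)
{
	fake_get(m);
	fake_put(m);
	return m->refcnt;
}
```

With the unconditional pair, the caller no longer needs a special case for code running inside an exit routine.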
     

18 Dec, 2014

1 commit


08 Dec, 2014

1 commit


04 Dec, 2014

2 commits

  • Dropping to untagged mode when ramping down a queue due to QUEUE FULL
    events has two problems:

    - nothing in the midlayer or drivers ever moves back to tagged mode
    during queue ramp up.
    - cmd_per_lun isn't the untagged queue depth for many modern drivers
    that can handle multiple untagged commands, and this is the only
    place in the midlayer assuming that.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen

    Christoph Hellwig
     
  • Since we got rid of ordered tag support in 2010 the prime use case of
    switching on and off ordered tags has been obsolete. The other function
    of enabling/disabling tagging entirely has only been correctly implemented
    by the 53c700 driver and isn't generally useful.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen

    Christoph Hellwig
     

25 Nov, 2014

1 commit


24 Nov, 2014

1 commit

  • Drop the now unused reason argument from the ->change_queue_depth method.
    Also add a return value to scsi_adjust_queue_depth, and rename it to
    scsi_change_queue_depth now that it can be used as the default
    ->change_queue_depth implementation.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mike Christie
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
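A rough userspace sketch of the reworked interface follows. `struct qd_sdev`, `struct qd_host_template` and `fake_change_queue_depth()` are hypothetical, cut-down stand-ins, and the "ignore non-positive depths" rule is an assumption made for the example. The point is only the shape: no reason argument, and a return value that lets the helper be plugged in directly as the callback.

```c
/* Hypothetical, cut-down stand-in for struct scsi_device. */
struct qd_sdev {
	int queue_depth;
};

/*
 * Models the reworked helper: there is no "reason" argument, and the
 * resulting depth is returned so the function can be used directly as a
 * ->change_queue_depth callback. Ignoring non-positive depths is an
 * assumption of this sketch.
 */
int fake_change_queue_depth(struct qd_sdev *sdev, int depth)
{
	if (depth > 0)
		sdev->queue_depth = depth;
	return sdev->queue_depth;
}

/* Hypothetical host template carrying only the callback of interest. */
struct qd_host_template {
	int (*change_queue_depth)(struct qd_sdev *sdev, int depth);
};
```

A driver without special requirements would then simply point its callback at the library helper.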
     

12 Nov, 2014

6 commits

  • Remove the tagged argument from scsi_adjust_queue_depth, and just let it
    handle the queue depth. For most drivers those two are fairly separate,
    given that most modern drivers don't care about the SCSI "tagged" status
    of a command at all, and many old drivers allow queuing of multiple
    untagged commands in the driver.

    Instead we start out with the ->simple_tags flag set before calling
    ->slave_configure, which is how all drivers actually looking at
    ->simple_tags except for one worked anyway. The one other case looks
    broken, but I've kept the behavior as-is for now.

    Except for that we only change ->simple_tags from the ->change_queue_type,
    and when rejecting a tag message in a single driver, so keeping this
    churn out of scsi_adjust_queue_depth is a clear win.

    Now that the usage of scsi_adjust_queue_depth is more obvious we can
    also remove all the trivial instances in ->slave_alloc or ->slave_configure
    that just set it to the cmd_per_lun default.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mike Christie
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Martin K. Petersen

    Christoph Hellwig
     
  • Allow a driver to ask for block layer tags by setting .use_blk_tags in the
    host template, in which case it will always see a valid value in
    request->tag, similar to the behavior when using blk-mq. This means even
    SCSI "untagged" commands will now have a tag, which is especially useful
    when using a host-wide tag map.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mike Christie
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     
  • Remove the ordered_tags field; we haven't been issuing ordered tags based
    on it since the big barrier rework in 2010.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mike Christie
    Reviewed-by: Bart Van Assche
    Reviewed-by: Martin K. Petersen

    Christoph Hellwig
     
  • Most drivers use exactly the same implementation, so provide it as a
    library function.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Mike Christie
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     
  • Simplify scsi_log_(send|completion) by externalizing
    scsi_mlreturn_string() and always print the command address.

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     
  • We should be using the standard dev_printk() variants for
    sense code printing.

    [hch: remove __scsi_print_sense call in xen-scsiback, Acked by Juergen]
    [hch: folded bracing fix from Dan Carpenter]
    Signed-off-by: Hannes Reinecke
    Reviewed-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     

01 Oct, 2014

1 commit


16 Sep, 2014

1 commit

  • The SCSI specification requires that the second Command Data Byte
    should contain the LUN value in its high-order bits if the recipient
    device reports SCSI level 2 or below. Nevertheless, some USB
    mass-storage devices use those bits for other purposes in
    vendor-specific commands. Currently Linux has no way to send such
    commands, because the SCSI stack always overwrites the LUN bits.

    Testing shows that Windows 7 and XP do not store the LUN bits in the
    CDB when sending commands to a USB device. This doesn't matter if the
    device uses the Bulk-Only or UAS transports (which virtually all
    modern USB mass-storage devices do), as these have a separate
    mechanism for sending the LUN value.

    Therefore this patch introduces a flag in the Scsi_Host structure to
    inform the SCSI midlayer that a transport does not require the LUN
    bits to be stored in the CDB, and it makes usb-storage set this flag
    for all devices using the Bulk-Only transport. (UAS is handled by a
    separate driver, but it doesn't really matter because no SCSI-2 or
    lower device is at all likely to use UAS.)

    The patch also cleans up the code responsible for storing the LUN
    value by adding a bitflag to the scsi_device structure. The test for
    whether to stick the LUN value in the CDB can be made when the device
    is probed, and stored for future use rather than being made over and
    over in the fast path.

    Signed-off-by: Alan Stern
    Reported-by: Tiziano Bacocco
    Acked-by: Martin K. Petersen
    Acked-by: James Bottomley
    Signed-off-by: Christoph Hellwig

    Alan Stern
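The two halves of this change (decide once at probe time, then test a single bit in the fast path) can be sketched as below. All names here (`struct lun_sdev`, `fake_probe()`, `fake_prepare_cdb()`, and the two boolean parameters) are hypothetical and chosen for the example. The masks reflect the layout described above: the LUN occupies the top three bits of the second CDB byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down device state; field names are illustrative. */
struct lun_sdev {
	unsigned int lun;
	bool lun_in_cdb;	/* decided once, at probe time */
};

/*
 * Probe-time decision: the LUN goes into the CDB only for devices that
 * report SCSI-2 or older, and only if the host has not declared that its
 * transport carries the LUN separately.
 */
void fake_probe(struct lun_sdev *sdev, bool scsi2_or_older,
		bool host_no_lun_in_cdb)
{
	sdev->lun_in_cdb = scsi2_or_older && !host_no_lun_in_cdb;
}

/* Fast path: a single flag test instead of repeating the checks above. */
void fake_prepare_cdb(const struct lun_sdev *sdev, uint8_t *cdb)
{
	if (sdev->lun_in_cdb)
		cdb[1] = (uint8_t)((cdb[1] & 0x1f) | ((sdev->lun << 5) & 0xe0));
}
```

When the flag is clear, the second CDB byte is passed through untouched, which is what vendor-specific commands on such devices need.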
     

16 Aug, 2014

1 commit

  • If a scsi host driver specifies .cmd_len in its scsi_host_template, a driver's
    private command pool is needed. scsi_find_host_cmd_pool() will locate it, but
    scsi_alloc_host_cmd_pool() isn't saving the pool address in the host template.

    This will result in an access error when the host is removed.

    Avoid the problem by saving the address of a newly allocated command pool where
    it is expected.

    Signed-off-by: Juergen Gross
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig
    Fixes: 89d9a567952baec13e26ada3e438f1b642d66b6e
    Cc: stable@vger.kernel.org
    Signed-off-by: James Bottomley

    Juergen Gross
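A toy model of the bug and its fix, assuming hypothetical types and function names (`struct fake_cmd_pool`, `struct pool_host_template`, and the three `fake_*` functions) that only mirror the roles described above: the free path looks the pool up through the template, so the allocation path has to store it there.

```c
#include <stdlib.h>

/* Hypothetical per-driver command pool. */
struct fake_cmd_pool {
	unsigned int users;
};

/* Hypothetical, cut-down host template. */
struct pool_host_template {
	unsigned int cmd_size;
	struct fake_cmd_pool *cmd_pool;
};

/* Lookup used by the free path: relies on the pointer kept in the template. */
struct fake_cmd_pool *fake_find_host_cmd_pool(struct pool_host_template *t)
{
	return t->cmd_pool;
}

struct fake_cmd_pool *fake_alloc_host_cmd_pool(struct pool_host_template *t)
{
	struct fake_cmd_pool *pool = calloc(1, sizeof(*pool));

	if (!pool)
		return NULL;
	t->cmd_pool = pool;	/* the step the fix adds */
	return pool;
}

void fake_free_host_cmd_pool(struct pool_host_template *t)
{
	free(fake_find_host_cmd_pool(t));
	t->cmd_pool = NULL;
}
```

Without the assignment in the allocation function, the lookup at removal time would return a pointer that was never set.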
     

29 Jul, 2014

1 commit


26 Jul, 2014

3 commits

  • Add two new device types, most importantly the zoned block device
    one.

    Split from an earlier patch by Hannes Reinecke.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Martin K. Petersen

    Christoph Hellwig
     
  • This patch adds support for an alternate I/O path in the scsi midlayer
    which uses the blk-mq infrastructure instead of the legacy request code.

    Use of blk-mq is fully transparent to drivers, although for now a host
    template field is provided to opt out of blk-mq usage in case any unforeseen
    incompatibilities arise.

    In general replacing the legacy request code with blk-mq is a simple and
    mostly mechanical transformation. The biggest exception is the new code
    that deals with the fact that I/O submissions in blk-mq must happen from
    process context, which slightly complicates the I/O completion handler.
    The second biggest difference is that blk-mq is built around the concept
    of preallocated requests that also include driver specific data, which
    in SCSI context means the scsi_cmnd structure. This completely avoids
    dynamic memory allocations for the fast path through I/O submission.

    Due to the preallocated requests the MQ code path exclusively uses the
    host-wide shared tag allocator instead of a per-LUN one. This only
    affects drivers actually using the block layer provided tag allocator
    instead of their own. Unlike the old path blk-mq always provides a tag,
    although drivers don't have to use it.

    For now the blk-mq path is disabled by default and must be enabled using
    the "use_blk_mq" module parameter. Once the remaining work in the block
    layer to make blk-mq more suitable for slow devices is complete I hope
    to make it the default and eventually even remove the old code path.

    Based on the earlier scsi-mq prototype by Nicholas Bellinger.

    Thanks to Bart Van Assche and Robert Elliott for testing, benchmarking and
    various suggestions and code contributions.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
     
  • Seems like these counters are missing any sort of synchronization for
    updates, as an over 10 year old comment from me noted. Fix this by
    using atomic counters, and while we're at it also make sure they are
    in the same cacheline as the _busy counters and not needlessly stored
    to in every I/O completion.

    With the new model the _busy counters can temporarily go negative,
    so all the readers are updated to check for > 0 values. Longer
    term every successful I/O completion will reset the counters to zero,
    so the temporarily negative values will not cause any harm.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
     

25 Jul, 2014

3 commits

  • Avoid taking the host-wide host_lock to check the per-host queue limit.
    Instead we do an atomic_inc_return early on to grab our slot in the queue,
    and if necessary decrement it after finishing all checks.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
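The "grab a slot first, give it back if the checks fail" pattern can be sketched in userspace with C11 atomics. `struct fake_host` and `fake_host_queue_ready()` are hypothetical names, and `atomic_fetch_add(...) + 1` stands in for the kernel's `atomic_inc_return()`. This is an illustration of the idea, not the midlayer code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, cut-down host: a busy counter and its limit. */
struct fake_host {
	atomic_int host_busy;
	int can_queue;
};

/*
 * Reserve a slot with a single atomic increment, then check the limit.
 * If the host turns out to be full, the reservation is undone. No lock
 * is held at any point.
 */
bool fake_host_queue_ready(struct fake_host *h)
{
	/* fetch_add returns the old value, so add 1 to get the new one */
	int busy = atomic_fetch_add(&h->host_busy, 1) + 1;

	if (h->can_queue > 0 && busy > h->can_queue) {
		atomic_fetch_sub(&h->host_busy, 1);
		return false;
	}
	return true;
}
```

Under contention the counter can briefly exceed the limit between the increment and the matching decrement, which is the price of not taking the lock.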
     
  • The blk-mq code path will set this to a different function, so make the
    code simpler by setting it up in a legacy-request specific place.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
     
  • Make sure we only have the logic for requeuing commands in one place.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
     

18 Jul, 2014

5 commits

  • While checking what scsi_adjust_queue_depth() did I thought its switch
    statement could be clearer:

    - remove redundant assignment (to sdev->queue_depth)
    - re-order cases (thus removing the fall-through)

    Signed-off-by: Douglas Gilbert
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Robert Elliott
    Tested-by: Robert Elliott
    Signed-off-by: Christoph Hellwig

    Douglas Gilbert
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Paolo Bonzini
    Reviewed-by: Hannes Reinecke

    Christoph Hellwig
     
  • Using dev_printk variants prefixes the logging message with
    the originating device, which makes debugging easier.

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
     
  • The SCSI standard defines 64-bit values for LUNs, and large arrays
    employing large or hierarchical LUN numbers become more and more
    common.

    So update the Linux SCSI stack to use 64-bit LUN numbers.

    Signed-off-by: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ewan Milne
    Signed-off-by: Christoph Hellwig

    Hannes Reinecke
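As an illustration of what a 64-bit LUN number can hold, the sketch below packs the 8-byte, four-level LUN structure used on the wire into a single integer. The function name `fake_lun_to_int()` is hypothetical, and the packing shown (each two-byte level placed so that a simple single-level LUN keeps its small integer value) is an assumption of this example rather than something stated in the commit message.

```c
#include <stdint.h>

/*
 * Pack an 8-byte LUN (four levels of two bytes each) into a 64-bit
 * integer. The first level lands in the low 16 bits, so a plain
 * single-level LUN such as {0x00, 0x01, 0, ...} becomes 1.
 */
uint64_t fake_lun_to_int(const uint8_t lun[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i += 2)
		v |= ((uint64_t)lun[i] << ((i + 1) * 8)) |
		     ((uint64_t)lun[i + 1] << (i * 8));
	return v;
}
```

Values that use the upper address-method bits or more than one level no longer fit in 32 bits, which is the case the wider type is meant to cover.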
     
  • scsi_put_command() is either invoked before blk_start_request() or
    after block layer processing has completed. scsi_cmnd.abort_work
    is scheduled from inside the SCSI timeout handler. The block layer
    guarantees that either the regular completion handler
    (softirq_done_fn()) or the timeout handler (rq_timed_out_fn()) is
    invoked but not both. This means that scsi_put_command() is never
    invoked while abort_work is scheduled. Hence remove the
    cancel_delayed_work() call from scsi_put_command().

    Similarly, scsi_abort_command() is only invoked from the SCSI
    timeout handler. If scsi_abort_command() is invoked for a SCSI
    command with the SCSI_EH_ABORT_SCHEDULED flag set this means that
    scmd_eh_abort_handler() has already invoked scsi_queue_insert() and
    hence that scsi_cmnd.abort_work is no longer pending. Hence also
    remove the cancel_delayed_work() call from scsi_abort_command().

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Christoph Hellwig

    Bart Van Assche
     

12 Apr, 2014

1 commit


11 Apr, 2014

1 commit

  • async_schedule() sd resume work to allow disks and other devices to
    resume in parallel.

    This moves the entirety of scsi_device resume to an async context to
    ensure that scsi_device_resume() remains ordered with respect to the
    completion of the start/stop command. For the duration of the resume,
    new command submissions (that do not originate from the scsi-core) will
    be deferred (BLKPREP_DEFER).

    It adds a new ASYNC_DOMAIN_EXCLUSIVE(scsi_sd_pm_domain) as a container
    of these operations. Like scsi_sd_probe_domain it is flushed at
    sd_remove() time to ensure async ops do not continue past the
    end-of-life of the sdev. The implementation explicitly refrains from
    reusing scsi_sd_probe_domain directly for this purpose as it is flushed
    at the end of dpm_resume(), potentially defeating some of the benefit.
    Given sdevs are quiesced it is permissible for these resume operations
    to bleed past the async_synchronize_full() calls made by the driver
    core.

    We defer the resolution of which pm callback to call until
    scsi_dev_type_{suspend|resume} time and guarantee that the callback
    parameter is never NULL. With this in place the type of resume
    operation is encoded in the async function identifier.

    There is a concern that async resume could trigger PSU overload. In the
    enterprise, storage enclosures enforce staggered spin-up regardless of
    what the kernel does making async scanning safe by default. Outside of
    that context a user can disable asynchronous scanning via a kernel
    command line or CONFIG_SCSI_SCAN_ASYNC. Honor that setting when
    deciding whether to do resume asynchronously.

    Inspired by Todd's analysis and initial proposal [2]:
    https://01.org/suspendresume/blogs/tebrandt/2013/hard-disk-resume-optimization-simpler-approach

    Cc: Len Brown
    Cc: Phillip Susi
    [alan: bug fix and clean up suggestion]
    Acked-by: Alan Stern
    Suggested-by: Todd Brandt
    [djbw: kick all resume work to the async queue]
    Signed-off-by: Dan Williams

    Dan Williams
     

27 Mar, 2014

5 commits

  • This allows drivers to specify the size of their per-command private
    data in the host template and then get extra memory allocated for
    each command instead of needing another allocation in ->queuecommand.

    With the current SCSI code that already does multiple allocations for
    each command this probably doesn't make a big performance impact, but
    it allows cleaning up the drivers, and prepares them for using the
    blk-mq infrastructure where the common allocation will make a difference.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Paolo Bonzini
    Signed-off-by: James Bottomley

    Christoph Hellwig
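The idea of a single allocation that carries both the command and the driver's private area can be sketched as follows. `struct priv_cmnd`, `fake_alloc_cmnd()` and `fake_cmd_priv()` are hypothetical names for this userspace example. The command struct is over-aligned here purely so that the trailing area is suitably aligned for any driver type.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down command; aligned so trailing data fits any type. */
struct priv_cmnd {
	_Alignas(max_align_t) int tag;
};

/*
 * One allocation covers the command plus cmd_size bytes of driver data,
 * where cmd_size is what the driver declared in its host template.
 */
struct priv_cmnd *fake_alloc_cmnd(size_t cmd_size)
{
	return calloc(1, sizeof(struct priv_cmnd) + cmd_size);
}

/* The private area starts immediately after the command itself. */
void *fake_cmd_priv(struct priv_cmnd *cmd)
{
	return cmd + 1;
}
```

A driver would then cast the pointer returned by the accessor to its own per-command structure instead of allocating one in its queuecommand path.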
     
  • Just have one level of alloc/free functions that take a host instead
    of two levels for the allocation and different calling conventions
    for the free.

    [fengguang.wu@intel.com: docbook problems spotted, now fixed]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Paolo Bonzini
    Signed-off-by: James Bottomley

    Christoph Hellwig
     
  • We don't use the passed in scsi command for anything, so just add an
    adapter-wide internal status to go along with the internal scb that is used under
    int_mtx to pass back the return value and get rid of all the complexities
    and abuse of the scsi_cmnd structure.

    This gets rid of the only user of scsi_allocate_command/scsi_free_command,
    which can now be removed.

    [jejb: checkpatch fixes]
    Signed-off-by: Christoph Hellwig
    Acked-by: Adam Radford
    Signed-off-by: James Bottomley

    Christoph Hellwig
     
  • EVPD page 0x83 is used to uniquely identify the device.
    So instead of having each and every program issue a separate
    SG_IO call to retrieve this information it does make far more
    sense to display it in sysfs.

    Some older devices (most notably tapes) will only report reliable
    information in page 0x80 (Unit Serial Number). So export this
    in the sysfs attribute 'vpd_pg80'.

    [jejb: checkpatch fix]
    [hare: attach after transport configure]
    [fengguang.wu@intel.com: spotted problems with the original now fixed]
    Signed-off-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Hannes Reinecke
     
  • We should be returning the number of bytes of the
    requested VPD page in scsi_vpd_inquiry.
    This makes it easier for the caller to verify the
    required space.

    [jejb: fix up mm warning spotted by Sergey]
    Tested-by: Sergey Senozhatsky
    Signed-off-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Hannes Reinecke
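How a caller might use such a return value can be sketched as below. The helper names `fake_vpd_page_size()` and `fake_vpd_fits()` are hypothetical. The layout assumed is the usual VPD page header: a 4-byte header whose bytes 2 and 3 hold the big-endian page length.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Total size of a VPD page: the 4-byte header plus the big-endian
 * PAGE LENGTH stored in bytes 2..3. Returns -1 if even the header
 * is not available.
 */
int fake_vpd_page_size(const uint8_t *buf, size_t buf_len)
{
	if (buf_len < 4)
		return -1;
	return ((buf[2] << 8) | buf[3]) + 4;
}

/* Did the whole page fit into the buffer that was supplied? */
int fake_vpd_fits(const uint8_t *buf, size_t buf_len)
{
	int size = fake_vpd_page_size(buf, buf_len);

	return size >= 0 && (size_t)size <= buf_len;
}
```

If the page did not fit, the caller knows exactly how large a buffer to allocate before retrying.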
     

16 Mar, 2014

2 commits


19 Dec, 2013

2 commits

  • The documentation has gone out-of-sync, so update it to
    the current status.

    Signed-off-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Hannes Reinecke
     
  • When a command runs into a timeout we need to send an 'ABORT TASK'
    TMF. This is typically done by the 'eh_abort_handler' LLDD callback.

    Conceptually, however, this function is a normal SCSI command, so
    there is no need to enter the error handler.

    This patch implements a new scsi_abort_command() function which
    invokes an asynchronous function scsi_eh_abort_handler() to
    abort the commands via the usual 'eh_abort_handler'.

    If abort succeeds the command is either retried or terminated,
    depending on the number of allowed retries. However, 'eh_eflags'
    records the abort, so if the retry would fail again the
    command is pushed onto the error handler without trying to
    abort it (again); it'll be cleared up from SCSI EH.

    [hare: smatch detected stray switch fixed]
    Signed-off-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Hannes Reinecke
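The decision made on each timeout can be reduced to a small state check, sketched here with hypothetical names (`struct abort_cmnd`, `FAKE_EH_ABORT_SCHEDULED`, `fake_on_timeout()`) and an arbitrary flag value. The first timeout schedules an asynchronous abort and records that in `eh_eflags`. A second timeout of the same command skips the abort and goes straight to the error handler.

```c
/* Hypothetical flag bit recording that an abort was already scheduled. */
#define FAKE_EH_ABORT_SCHEDULED 0x0002u

enum fake_timeout_action {
	FAKE_SCHEDULE_ABORT,	/* try an asynchronous ABORT TASK first */
	FAKE_ESCALATE_TO_EH,	/* hand the command to SCSI EH */
};

/* Hypothetical, cut-down command carrying only the flags of interest. */
struct abort_cmnd {
	unsigned int eh_eflags;
};

/*
 * Called when a command times out. An abort is attempted only once per
 * command; if the retried command times out again, it is escalated.
 */
enum fake_timeout_action fake_on_timeout(struct abort_cmnd *cmd)
{
	if (cmd->eh_eflags & FAKE_EH_ABORT_SCHEDULED)
		return FAKE_ESCALATE_TO_EH;

	cmd->eh_eflags |= FAKE_EH_ABORT_SCHEDULED;
	return FAKE_SCHEDULE_ABORT;
}
```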
     

25 Oct, 2013

1 commit