15 Sep, 2016

1 commit

  • Before using vendor-specific VPD pages for getting raid_level and
    device_id, check for page support. If page isn't supported, don't try
    to use it. Also, pay attention to return status on hpsa_get_device_id.

    [mkp: fix boolean return warnings reported by kbuild test robot]

    Reviewed-by: Scott Benesh
    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Signed-off-by: Don Brace
    Signed-off-by: Martin K. Petersen

    Scott Teel
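
The page-support check described above can be sketched in plain C. This is a minimal user-space illustration, not the driver's code: the helper name vpd_page_supported and the sample page codes are assumptions, and the buffer is assumed to follow the standard "Supported VPD Pages" (page 0x00) layout, with the list length in bytes 2-3 and the page codes starting at byte 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: scan a "Supported VPD Pages" (page 0x00) response
 * for a given page code. Returns false when the buffer is too short or
 * the page is not listed, so the caller can skip the unsupported page.
 */
static bool vpd_page_supported(const uint8_t *page0, size_t buf_len,
                               uint8_t page)
{
    size_t count, i;

    if (page0 == NULL || buf_len < 4)
        return false;               /* too short to hold the header */

    /* Page length is big-endian in bytes 2-3. */
    count = ((size_t)page0[2] << 8) | page0[3];
    if (count > buf_len - 4)
        count = buf_len - 4;        /* never read past the caller's buffer */

    for (i = 0; i < count; i++)
        if (page0[4 + i] == page)
            return true;
    return false;
}
```

A caller would issue the vendor-specific inquiry only when this returns true, and would still check the status of that inquiry afterwards.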
     

24 Feb, 2016

2 commits


07 Jan, 2016

1 commit


10 Nov, 2015

4 commits

  • Reviewed-by: Scott Teel
    Reviewed-by: Justin Lindley
    Reviewed-by: Kevin Barnett
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Tomas Henzl
    Reviewed-by: Matthew R. Ochs
    Signed-off-by: Don Brace
    Signed-off-by: Martin K. Petersen

    Kevin Barnett
     
  • When external target arrays are present, disable the firmware's
    normal behavior of returning a cached copy of the report lun data,
    and force it to collect new data each time we request a report luns.

    This is necessary for external arrays, since there may be no
    reliable signal from the external array to the smart array when
    lun configuration changes, and thus when the driver requests
    report luns, the data returned may be stale.

    Use diag options to turn off RPL data caching.

    Reviewed-by: Scott Teel
    Reviewed-by: Justin Lindley
    Reviewed-by: Kevin Barnett
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Matthew R. Ochs
    Signed-off-by: Don Brace
    Signed-off-by: Martin K. Petersen

    Scott Teel
     
  • External array LUNs must use target and lun numbers assigned by the
    external array. So the driver must treat these differently from
    local LUNs when assigning lun/target.

    The LUN's 'model' field has been used to detect LUN types that need
    special treatment, but the goal is to eliminate the need to reference
    specific array models, and to support any external array.

    Pass-through RAID (PTRAID) luns are not luns of the local controller,
    so they are not reported in LUN count of command 'ID controller'.
    However, they ARE reported in "Report logical Luns" command.
    Local luns are listed first, then PTRAID LUNs.

    The LUNs returned by "Report LUNs" in excess of those counted by
    'ID controller' are therefore the PTRAID LUNs.

    We can now remove function is_ext_target, and the 'white list'
    array of supported model names.

    Reviewed-by: Scott Teel
    Reviewed-by: Justin Lindley
    Reviewed-by: Kevin Barnett
    Signed-off-by: Don Brace
    Reviewed-by: Matthew R. Ochs
    Signed-off-by: Martin K. Petersen

    Scott Teel
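
The counting rule in this message reduces to simple index arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes, as the message states, that local LUNs are listed first and PTRAID LUNs follow.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of external (PTRAID) LUNs: whatever "Report LUNs" returns beyond
 * the local count from 'ID controller'. Guards against a reported count
 * that is smaller than the local count.
 */
static uint32_t count_external_luns(uint32_t nreported, uint32_t nlocal)
{
    return nreported > nlocal ? nreported - nlocal : 0;
}

/*
 * Local LUNs come first in the list, so an entry is external when its
 * zero-based index is at or past the local count.
 */
static bool lun_is_external(uint32_t lun_index, uint32_t nlocal)
{
    return lun_index >= nlocal;
}
```

With this rule no model-name list is needed: position in the report is enough to classify an entry.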
     
  • The driver is using two MACROs which seemingly are looking in
    the wrong location for the device_flags returned from
    CISS_REPORT_PHYS. Both MACROs, NON_DISK_PHYS_DEV and
    PHYS_IOACCEL, are using the pointer returned from figure_lunaddrbytes
    which is the address of the LUN.lunid element in
    the extended CISS_REPORT_PHYS. But the MACROS are using offsets
    beyond the range of the element (offset 17 of an 8 byte element).

    These MACROs actually are looking at the correct location, but
    they fail static checker analysis. They will also break
    if any new elements are added to the extended LUN structure.

    Change the code to use the structure elements directly
    since this MACRO is only used in one location.

    Reported-by: Dan Carpenter
    Reviewed-by: Scott Teel
    Reviewed-by: Justin Lindley
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Don Brace
    Signed-off-by: Martin K. Petersen

    Don Brace
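
To make the offset problem concrete, here is a small sketch using a hypothetical entry layout chosen so that the flags byte lands at offset 17, as the message describes. The struct, field names and flag bit values are assumptions for illustration, not the driver's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extended report entry; all names are illustrative. */
struct ext_lun_entry {
    uint8_t  lunid[8];
    uint8_t  wwid[8];
    uint8_t  device_type;
    uint8_t  device_flags;      /* byte offset 17 from the start */
    uint8_t  lun_count;
    uint8_t  redundant_paths;
    uint32_t ioaccel_handle;
};

#define FLAG_NON_DISK 0x01      /* assumed bit values */
#define FLAG_IOACCEL  0x08

/*
 * Reading the field by name stays correct if members are added or moved,
 * and never indexes past the end of the 8-byte lunid array the way a
 * "(lunid)[17]" style macro does.
 */
static bool entry_is_non_disk(const struct ext_lun_entry *e)
{
    return (e->device_flags & FLAG_NON_DISK) != 0;
}

static bool entry_is_ioaccel(const struct ext_lun_entry *e)
{
    return (e->device_flags & FLAG_IOACCEL) != 0;
}
```

The offset-based form only works because the flags happen to sit 17 bytes after the start of lunid; the field-based form does not depend on that coincidence.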
     

27 Aug, 2015

2 commits

  • prevent adding volumes that are not available.

    Reviewed-by: Kevin Barnett
    Reviewed-by: Scott Teel
    Reviewed-by: Justin Lindley
    Reviewed-by: Tomas Henzl
    Signed-off-by: Don Brace
    Signed-off-by: James Bottomley

    Scott Benesh
     
  • need to add PMC to copyright notice and update the Hewlett-Packard
    copyright notification.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Justin Lindley
    Signed-off-by: Don Brace
    Signed-off-by: James Bottomley

    Don Brace
     

01 Jun, 2015

8 commits

  • Synchronize completion of the reset with completion of outstanding commands

    Extending the newly-added synchronous abort functionality,
    now also synchronize resets with the completion of outstanding commands.
    Rename the wait queue to reflect the fact that it's being used for both
    types of waits. Also, don't complete commands which are terminated
    due to a reset operation.

    fix for controller lockup during reset

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Webb Scales
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Webb Scales
     
  • Don't return from the abort request until the target command is complete.
    Mark outstanding commands which have a pending abort, and do not send them
    to the host if we can avoid it.

    If the current command has been aborted, do not call the SCSI command
    completion routine from the I/O path: when the abort returns successfully,
    the SCSI mid-layer will handle the completion implicitly.

    The following race was possible in theory.

    1. LLD is requested to abort a scsi command
    2. scsi command completes
    3. The struct CommandList associated with 2 is made available.
    4. new io request to LLD to another LUN re-uses struct CommandList
    5. abort handler follows scsi_cmnd->host_scribble and
    finds struct CommandList and tries to abort it.

    Now we have aborted the wrong command.

    Fix by resetting the scsi_cmd field of struct CommandList
    upon completion and making the abort handler check that
    the scsi_cmd pointer in the CommandList struct matches the
    scsi_cmnd that it has been asked to abort.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Webb Scales
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Webb Scales
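
The fix for the stale-pointer race can be sketched as follows. This is a single-threaded illustration with stand-in types (struct scsi_request for the SCSI command, struct command_slot for struct CommandList); the names are hypothetical and the locking a real driver needs is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct scsi_request { int tag; };          /* stand-in for the SCSI command */

struct command_slot {                      /* stand-in for struct CommandList */
    struct scsi_request *scsi_cmd;         /* owner, or NULL when idle */
};

/* Bind a slot to the request it is carrying. */
static void slot_start(struct command_slot *c, struct scsi_request *rq)
{
    c->scsi_cmd = rq;
}

/* On completion, clear the back pointer so stale lookups cannot match. */
static void slot_complete(struct command_slot *c)
{
    c->scsi_cmd = NULL;
}

/*
 * The abort handler proceeds only if the slot still belongs to the
 * request it was asked to abort; a reused slot fails this check.
 */
static bool abort_target_matches(const struct command_slot *c,
                                 const struct scsi_request *rq)
{
    return c->scsi_cmd == rq;
}
```

Replaying steps 1 to 5 above against this sketch, the abort in step 5 sees a slot owned by a different request and leaves it alone.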
     
  • add support for tmf when in ioaccel2 mode

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Joe Handzik
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Stephen Cameron
     
  • improve ioaccel2 error handling, including better handling of
    underrun statuses

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Signed-off-by: Joe Handzik
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Joe Handzik
     
  • Factor out hpsa_cmd_init from cmd_alloc(). We also need
    this for resubmitting commands down the default RAID path
    when they have returned from the ioaccel paths with errors.

    In particular, reinitialize the cmd_type and busaddr fields as these
    will not be correct for submitting down the RAID stack path
    after ioaccel command completion.

    This saves time when submitting commands.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Stephen Cameron
     
  • In hba mode, we could get sense data in descriptor format so
    we need to handle that.

    It's possible for CommandStatus to have value 0x0D
    "TMF Function Status", which we should handle. We will get
    this from a P1224 when aborting a non-existent tag, for
    example. The "ScsiStatus" field of the errinfo field
    will contain the TMF function status value.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Stephen Cameron
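
Handling both sense formats comes down to reading the key, ASC and ASCQ from different byte positions depending on the response code. The sketch below follows the standard SCSI layouts (0x70/0x71 fixed, 0x72/0x73 descriptor); the struct and function names are hypothetical and it does not cover the TMF status case mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sense_info {
    uint8_t key;    /* sense key */
    uint8_t asc;    /* additional sense code */
    uint8_t ascq;   /* additional sense code qualifier */
};

/* Decode key/ASC/ASCQ from either fixed or descriptor format sense data. */
static bool decode_sense(const uint8_t *buf, size_t len,
                         struct sense_info *out)
{
    if (buf == NULL || out == NULL || len < 1)
        return false;

    switch (buf[0] & 0x7f) {        /* response code, ignoring the top bit */
    case 0x70:
    case 0x71:                      /* fixed format */
        if (len < 14)
            return false;
        out->key = buf[2] & 0x0f;
        out->asc = buf[12];
        out->ascq = buf[13];
        return true;
    case 0x72:
    case 0x73:                      /* descriptor format */
        if (len < 4)
            return false;
        out->key = buf[1] & 0x0f;
        out->asc = buf[2];
        out->ascq = buf[3];
        return true;
    default:
        return false;               /* unknown response code */
    }
}
```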
     
  • Allow driver initiated commands to have a timeout. It does not
    yet try to do anything with timeouts on such commands.

    We are sending a reset in order to get rid of a command we want to abort.
    If we make it return on the same reply queue as the command we want to abort,
    the completion of the aborted command will not race with the completion of
    the reset command.

    Rename hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd(), since
    this function is the interface for issuing commands to the controller and
    not the "core" of that implementation. Add a parameter to it which allows
    the caller to specify the reply queue to be used. Modify existing callers
    to specify the default reply queue.

    Rename __hpsa_scsi_do_simple_cmd_core() to hpsa_scsi_do_simple_cmd_core(),
    since this routine is the "core" implementation of the "do simple command"
    function and there is no longer any other function with a similar name.
    Modify the existing callers of this routine (other than
    hpsa_scsi_do_simple_cmd()) to instead call hpsa_scsi_do_simple_cmd(), since
    it will now accept the reply_queue parameter, and it provides a controller
    lock-up check. (Also, tweak two related message strings to make them
    distinct from each other.)

    Submitting a command to a locked up controller always results in a timeout,
    so check for controller lock-up before submitting.

    This is to enable fixing a race between command completions and
    abort completions on different reply queues in a subsequent patch.
    We want to be able to specify which reply queue an abort completion
    should occur on so that it cannot race the completion of the command
    it is trying to abort.

    The following race was possible in theory:

    1. Abort command is sent to hardware.
    2. Command to be aborted simultaneously completes on another
    reply queue.
    3. Hardware receives abort command, decides command has already
    completed and indicates this to the driver via a different
    reply queue.
    4. Driver processes abort completion, finds that the hardware does not know
    about the command, concludes that therefore the command cannot complete,
    and returns SUCCESS, indicating to the mid-layer that the scsi_cmnd may be
    re-used.
    5. Command from step 2 is processed and completed back to scsi mid
    layer (after we already promised that would never happen.)

    Fix by forcing aborts to complete on the same reply queue as the command
    they are aborting.

    Piggybacking device rescanning functionality onto the lockup
    detection thread is not a good idea because if the controller
    locks up during device rescanning, then the thread could get
    stuck, then the lockup isn't detected. Use separate work
    queues for device rescanning and lockup detection.

    Detect controller lockup in abort handler.

    After a lockup is detected, return DID_NO_CONNECT which results in immediate
    termination of commands rather than DID_ERROR which results in retries.

    Modify detect_controller_lockup() to return the result, to remove the need for
    a separate check.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Signed-off-by: Webb Scales
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Webb Scales
     
  • Cache the ioaccel handle so that when we need to abort commands sent
    down the ioaccel2 path, we can look up the LUN ID in h->dev[] instead of
    having to do I/O to the controller.

    Add a field to elements in h->dev[] to keep track of how the device is exposed
    to the SCSI mid layer: Not at all, without an upper level driver
    (no_uld_attach) or normally exposed.

    Since masked physical devices are now present in h->dev[] array
    it would be perfectly possible to do

    echo scsi add-single-device 2 2 0 0 > /proc/scsi/scsi

    and bring them online. This was previously not allowed for masked
    physical devices.

    Ensure that the mapping of physical disks to logical drives gets updated in a
    consistent way when a RAID migration occurs and is not touched until updates
    to it are complete.

    Now, instead of doing CISS_REPORT_PHYSICAL to get the LUNID for
    the physical disk in hpsa_get_pdisk_of_ioaccel2(), just get
    it out of h->dev[] where we already have it cached.

    Do not touch phys_disk[] for ioaccel enabled logical drives during rescan.

    Reviewed-by: Scott Teel
    Reviewed-by: Kevin Barnett
    Reviewed-by: Tomas Henzl
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Stephen Cameron
     

03 Feb, 2015

6 commits

  • There's no reason for it to be a void *; it should be a struct scsi_cmnd *

    Reviewed-by: Scott Teel
    Signed-off-by: Don Brace
    Signed-off-by: Christoph Hellwig

    Stephen Cameron
     
  • This means changing the allocator to reference count commands.
    The reference count is now the authoritative indicator of whether a
    command is allocated or not. The h->cmd_pool_bits bitmap is now
    only a heuristic hint to speed up the allocation process; it is no
    longer the authoritative record of allocated commands.

    Since we changed the command allocator to use reference counting
    as the authoritative indicator of whether a command is allocated,
    fail_all_outstanding_cmds needs to use the reference count, not
    h->cmd_pool_bits, for this purpose.

    Fix hpsa_drain_accel_commands to use the reference count as the
    authoritative indicator of whether a command is allocated instead of
    the h->cmd_pool_bits bitmap.

    Reviewed-by: Scott Teel
    Signed-off-by: Don Brace
    Signed-off-by: Christoph Hellwig

    Webb Scales
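
A reference-counted pool with a bitmap used only as a hint might look like the sketch below. It is a user-space illustration using C11 atomics; the pool size, type names and function names are all hypothetical and are not taken from the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4                  /* hypothetical pool size */

struct cmd { atomic_int refcount; }; /* refcount > 0 means allocated */

struct cmd_pool {
    struct cmd  cmds[POOL_SIZE];
    atomic_uint hint_bits;           /* bit i set: slot i is probably busy */
};

static void cmd_pool_init(struct cmd_pool *p)
{
    atomic_init(&p->hint_bits, 0u);
    for (size_t i = 0; i < POOL_SIZE; i++)
        atomic_init(&p->cmds[i].refcount, 0);
}

/* The reference count, not the bitmap, decides ownership. */
static bool cmd_is_allocated(struct cmd *c)
{
    return atomic_load(&c->refcount) > 0;
}

static struct cmd *cmd_alloc(struct cmd_pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (atomic_load(&p->hint_bits) & (1u << i))
            continue;                /* hint says busy: cheap skip */
        if (atomic_fetch_add(&p->cmds[i].refcount, 1) != 0) {
            /* Someone else holds it; undo our reference and move on. */
            atomic_fetch_sub(&p->cmds[i].refcount, 1);
            continue;
        }
        atomic_fetch_or(&p->hint_bits, 1u << i);
        return &p->cmds[i];
    }
    return NULL;                     /* pool exhausted */
}

static void cmd_free(struct cmd_pool *p, struct cmd *c)
{
    unsigned idx = (unsigned)(c - p->cmds);

    /* Clear the hint only when the last reference is dropped. */
    if (atomic_fetch_sub(&c->refcount, 1) == 1)
        atomic_fetch_and(&p->hint_bits, ~(1u << idx));
}
```

Code that needs to know which commands are outstanding would walk the pool and test cmd_is_allocated() rather than trusting the bitmap.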
     
  • When using the ioaccel submission methods, requests destined for RAID volumes
    are sometimes diverted to physical devices. The OS has little or no
    knowledge of these physical devices, so it is up to the driver to avoid
    pushing the device too hard. It is better to honor the physical device queue
    limit rather than making the device spew zillions of TASK SET FULL responses.

    This is so that hpsa based devices support /sys/block/sdNN/device/queue_type
    of simple, which lets the SCSI midlayer automatically adjust the queue_depth
    based on TASK SET FULL and GOOD status.

    Adjust the queue depth for a new device after it is created based on the
    maximum queue depths of the physical devices that constitute the
    device. This drops the maximum queue depth from .can_queue of 1024 to
    something like 174 for single-drive RAID-0, 348 for two-drive RAID-1, etc.
    It also adjusts for the ratio of data to parity drives.

    Reviewed-by: Scott Teel
    Signed-off-by: Webb Scales
    Signed-off-by: Don Brace
    Signed-off-by: Christoph Hellwig

    Don Brace
     
  • Instead of kicking the commands all the way back to the mid
    layer, use a work queue. This enables having a mechanism for
    the driver to be able to resubmit the commands down the "normal"
    raid path without turning off the ioaccel feature entirely
    whenever an error is encountered on the ioaccel path, and
    prevent excessive rescanning of devices.

    Reviewed-by: Scott Teel
    Signed-off-by: Don Brace
    Signed-off-by: Christoph Hellwig

    Don Brace
     
  • By not maintaining a list of queued commands, we can eliminate some spin
    locking in the main i/o path and gain significant improvement in IOPS. Remove
    the queuing code and the code that calls it; remove now-unused interrupt code;
    remove DIRECT_LOOKUP_BIT.

    Now that the passthru commands share the same command pool as
    the main i/o path, and the total size of the pool is less than
    or equal to the number of commands that will fit in the hardware
    fifo, there is no need to check to see if we are exceeding the
    hardware fifo's depth.

    Reviewed-by: Scott Teel
    Reviewed-by: Robert Elliott
    Signed-off-by: Don Brace
    Signed-off-by: Christoph Hellwig

    Don Brace
     
  • Correct endianness issues reported by sparse. SA controllers are
    little endian. This patch ensures endianness correctness.

    Signed-off-by: Don Brace
    Reviewed-by: Scott Teel
    Reviewed-by: Webb Scales
    Signed-off-by: Christoph Hellwig

    Don Brace
     

20 Nov, 2014

3 commits

  • Using bit fields for hardware command fields isn't portable and
    relies on assumptions about how the compiler lays out the bits.
    We can fix this in the driver's internal command structure, but the
    ioctl interface we can't change because it is part of the
    userland ABI.

    Signed-off-by: Don Brace
    Reviewed-by: Webb Scales
    Signed-off-by: Christoph Hellwig

    Stephen M. Cameron
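
The portable alternative to bit fields is to pack and unpack with explicit shifts and masks. The sketch below assumes a single byte holding a 3-bit type, a 3-bit attribute and a 2-bit direction; that layout and all names are illustrative assumptions, not the driver's definitions.

```c
#include <stdint.h>

/* Assumed layout: bits 0-2 type, bits 3-5 attribute, bits 6-7 direction. */
#define TYPE_MASK  0x07u
#define ATTR_SHIFT 3
#define ATTR_MASK  0x07u
#define DIR_SHIFT  6
#define DIR_MASK   0x03u

/* Build the byte explicitly so the layout does not depend on the compiler. */
static uint8_t pack_type_attr_dir(unsigned type, unsigned attr, unsigned dir)
{
    return (uint8_t)(((dir & DIR_MASK) << DIR_SHIFT) |
                     ((attr & ATTR_MASK) << ATTR_SHIFT) |
                     (type & TYPE_MASK));
}

static unsigned unpack_type(uint8_t b) { return b & TYPE_MASK; }
static unsigned unpack_attr(uint8_t b) { return (b >> ATTR_SHIFT) & ATTR_MASK; }
static unsigned unpack_dir(uint8_t b)  { return (b >> DIR_SHIFT) & DIR_MASK; }
```

Unlike a bit-field struct, the resulting byte is the same on every compiler and architecture, which is what hardware-defined fields require.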
     
  • The hardware needs little endian scatter gather addresses and
    lengths but we were not bothering to convert from cpu byte
    order as we should have been. On Intel, this is all just
    a bunch of no-op macros, but it makes the code endian-clean(er).

    Signed-off-by: Don Brace
    Signed-off-by: Robert Elliott
    Reviewed-by: Webb Scales
    Signed-off-by: Christoph Hellwig

    Stephen M. Cameron
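
In the kernel this conversion is done with helpers such as cpu_to_le32() and cpu_to_le64(). As a rough user-space illustration of what those conversions guarantee, the sketch below writes and reads little-endian values byte by byte; the helper names here are hypothetical.

```c
#include <stdint.h>

/* Store a 32-bit value as little endian, regardless of host byte order. */
static void put_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

static uint32_t get_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] |
           ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 24);
}

/* Same idea for a 64-bit value such as a scatter gather address. */
static void put_le64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t get_le64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | in[i];
    return v;
}
```

On a little-endian host these produce the same bytes as a plain store, which is why the missing conversions went unnoticed there.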
     
  • We were allocating roughly double the amount of memory
    we should be due to ReportLUNdata and ExtendedReportLUNdata
    containing a non-zero sized array but adding extra memory
    to allocate as if the array were zero sized.

    Track the logical and physical sizes separately.
    Allocate the memory based on the specific data
    structure sizes.

    Signed-off-by: Don Brace
    Reviewed-by: Webb Scales
    Signed-off-by: Christoph Hellwig

    Stephen M. Cameron
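
The over-allocation is easy to see with sizeof arithmetic. The structures below are hypothetical stand-ins (the entry size and MAX_LUNS value are assumptions), but they show how adding per-entry space on top of a struct that already embeds the full array roughly doubles the request.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LUNS 1024                     /* hypothetical limit */

struct lun_entry { uint8_t lunid[8]; };

struct report_lun_data {
    uint8_t lun_list_length[4];
    uint8_t reserved[4];
    struct lun_entry lun[MAX_LUNS];       /* array is already full size */
};

/* Correct: the struct size already accounts for every entry. */
static size_t report_buf_size(void)
{
    return sizeof(struct report_lun_data);
}

/* The mistake: adding the entries again as if lun[] were zero sized. */
static size_t report_buf_size_doubled(void)
{
    return sizeof(struct report_lun_data) +
           MAX_LUNS * sizeof(struct lun_entry);
}
```

Under these assumptions the correct size is 8200 bytes and the mistaken one is 16392 bytes.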
     

02 Jun, 2014

4 commits


16 Mar, 2014

9 commits