10 May, 2013

1 commit

  • These enums have been separate since the dawn of SAS, mainly because the
    latter is a procotol only enum and the former includes additional state
    for libsas. The dichotomy causes endless confusion about which one you
    should use where and leads to pointless warnings like this:

    drivers/scsi/mvsas/mv_sas.c: In function 'mvs_update_phyinfo':
    drivers/scsi/mvsas/mv_sas.c:1162:34: warning: comparison between 'enum sas_device_type' and 'enum sas_dev_type' [-Wenum-compare]

    Fix by eliminating one of them. The one kept is effectively the sas.h
    one, but call it sas_device_type and make sure the enums are all
    properly namespaced with the SAS_ prefix.

    Signed-off-by: James Bottomley

    James Bottomley
     

24 Aug, 2012

1 commit

  • libsas power management routines to suspend and recover the sas domain
    based on a model where the lldd is allowed and expected to be
    "forgetful".

    sas_suspend_ha - disable event processing allowing the lldd to take down
    links without concern for causing hotplug events.
    Regardless of whether the lldd actually posts link down
    messages libsas notifies the lldd that all
    domain_devices are gone.

    sas_prep_resume_ha - on the way back up before the lldd starts link
    training clean out any spurious events that were
    generated on the way down, and re-enable event
    processing

    sas_resume_ha - after the lldd has started and decided that all phys
    have posted link-up events this routine is called to let
    libsas start it's own timeout of any phys that did not
    resume. After the timeout an lldd can cancel the
    phy teardown by posting a link-up event.

    Storage for ex_change_count (u16) and phy_change_count (u8) are changed
    to int so they can be set to -1 to indicate 'invalidated'.

    Signed-off-by: Dan Williams
    Reviewed-by: Jacek Danecki
    Tested-by: Maciej Patelczyk
    Acked-by: Alan Stern
    Signed-off-by: James Bottomley

    Dan Williams
     

20 Jul, 2012

1 commit

  • When managing shost->host_eh_scheduled libata assumes that there is a
    1:1 shost-to-ata_port relationship. libsas creates a 1:N relationship
    so it needs to manage host_eh_scheduled cumulatively at the host level.
    The sched_eh and end_eh port port ops allow libsas to track when domain
    devices enter/leave the "eh-pending" state under ha->lock (previously
    named ha->state_lock, but it is no longer just a lock for ha->state
    changes).

    Since host_eh_scheduled indicates eh without backing commands pinning
    the device it can be deallocated at any time. Move the taking of the
    domain_device reference under the port_lock to guarantee that the
    ata_port stays around for the duration of eh.

    Reviewed-by: Jacek Danecki
    Acked-by: Jeff Garzik
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

23 Apr, 2012

1 commit

  • This changes the ordering of initialization and probing events from:
    1/ allocate rphy in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
    2/ allocate ata_port and schedule port probe in DISCE_PROBE
    ...to:
    1/ allocate ata_port in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
    2/ allocate rphy in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
    3/ schedule port probe in DISCE_PROBE

    This ordering prevents PHYE_SIGNAL_LOSS_EVENTS from sneaking in to
    destrory ata devices before they have been fully initialized:

    BUG: unable to handle kernel paging request at 0000000000003b10
    IP: [] sas_ata_end_eh+0x12/0x5e [libsas]
    ...
    [] sas_unregister_common_dev+0x78/0xc9 [libsas]
    [] sas_unregister_dev+0x4f/0xad [libsas]
    [] sas_unregister_domain_devices+0x7f/0xbf [libsas]
    [] sas_deform_port+0x61/0x1b8 [libsas]
    [] sas_phye_loss_of_signal+0x29/0x2b [libsas]

    ...and kills the awkward "sata domain_device briefly existing in the
    domain without an ata_port" state.

    Reported-by: Michal Kosciowski
    Signed-off-by: Dan Williams
    Acked-by: Jeff Garzik
    Signed-off-by: James Bottomley

    Dan Williams
     

01 Mar, 2012

4 commits

  • libsas ata error handling is already async but this does not help the
    scan case. Move initial link recovery out from under host->scan_mutex,
    and delay synchronization with eh until after all port probe/recovery
    work has been queued.

    Device ordering is maintained with scan order by still calling
    sas_rphy_add() in order of domain discovery.

    Since we now scan the domain list when invoking libata-eh we need to be
    careful to check for fully initialized ata ports.

    Acked-by: Jack Wang
    Acked-by: Jeff Garzik
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • ata devices are always scanned after ssp. Prior to the ata error
    handling reworks libsas would tend to scan devices in ascending expander
    phy order. Restore this ordering by deferring ssp discovery to a
    DISCE_PROBE event, and keep the probe order consistent with the
    discovery order, not the placement of sata devices.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • libsas fails to discover all sata devices in the domain. If a device fails
    negotiation and does not transmit a signature fis the link needs recovery.
    libata already understands how to manage slow to come up links, so treat these
    conditions as ata device attach events for the purposes of creating an
    ata_port. This allows libata to manage retrying link bring up.

    Rediscovery is modified to be careful about checking changes in dev_type. It
    looks like libsas leaks old devices if the sas address changes, but that's a
    fix for another patch.

    Acked-by: Jack Wang
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • If we have a domain with sas and sata devices there may still be sas
    recovery actions to take after peeling off the commands to send to
    libata.

    Reported-by: Andrzej Jakowski
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

20 Feb, 2012

5 commits

  • Link resets leave ata affiliations intact, so arrange for libsas to make
    an effort to avoid dropping the device due to a slow-to-recover link.
    Towards this end carry out reset in the host workqueue so that it can
    check for ata devices and kick the reset request to libata. Hard
    resets, in contrast, bypass libata since they are meant for associating
    an ata device with another initiator in the domain (tears down
    affiliations).

    Need to add a new transport_sas_phy_reset() since the current
    sas_phy_reset() is a utility function to libsas lldds. They are not
    prepared for it to loop back into eh.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • Since sata devices can take several seconds to recover the link on reset
    the 0.5 seconds that libsas currently waits may not be enough. Instead
    if we are rediscovering a phy that was previously attached to a sata
    device let libata handle any resets to encourage the device to transmit
    the initial fis.

    Once sas_ata_hard_reset() and lldds learn how to honor 'deadline' libsas
    should stop encountering phys in an intermediate state, until then this
    will loop until the fis is transmitted or ->attached_sas_addr gets
    cleared, but in the more likely initial discovery case we keep existing
    behavior.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • Until we have told the lldd to forget a task a timed out operation can
    return from the hardware at any time. Since completion frees the task
    we need to make sure that no tasks run their normal completion handler
    once eh has decided to manage the task. Similar to
    ata_scsi_cmd_error_handler() freeze completions to let eh judge the
    outcome of the race.

    Task collector mode is problematic because it presents a situation where
    a task can be timed out and aborted before the lldd has even seen it.
    For this case we need to guarantee that a task that an lldd has been
    told to forget does not get queued after the lldd says "never seen it".
    With sas_scsi_timed_out we achieve this with the ->task_queue_flush
    mutex, rather than adding more time.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • libata error handling provides for a timeout for link recovery. libsas
    must not rescan for previously known devices in this interval otherwise
    it may remove a device that is simply waiting for its link to recover.
    Let libata-eh make the determination of when the link is stable and
    prevent libsas (host workqueue) from taking action while this
    determination is pending.

    Using a mutex (ha->disco_mutex) to flush and disable revalidation while
    eh is running requires any discovery action that may block on eh be
    moved to its own context outside the lock. Probing ATA devices
    explicitly waits on ata-eh and the cache-flush-io issued during device
    removal may also pend awaiting eh completion. Essentially any rphy
    add/remove activity needs to run outside the lock.

    This adds two new cleanup states for sas_unregister_domain_devices()
    'allocated-but-not-probed', and 'flagged-for-destruction'. In the
    'allocated-but-not-probed' state dev->rphy points to a rphy that is
    known to have not been through a sas_rphy_add() event. At domain
    teardown check if this device is still pending probe and cleanup
    accordingly. Similarly if a device has already been queued for removal
    then sas_unregister_domain_devices has nothing to do.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • These are never freed in the nominal path. A domain_device has a
    different lifetime than a sas_rphy we need a dev->rphy independent way
    of identifying sata devices.

    Reviewed-by: Jack Wang
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

02 Mar, 2011

1 commit

  • The conversion is quite complex given that the libata new error
    handler has to be hooked into the current libsas timeout and error
    handling. The way this is done is to process all the failed commands
    via libsas first, but if they have no underlying sas task (and they're
    on a sata device) assume they are destined for the libata error
    handler and send them accordingly.

    Finally, activate the port recovery of the libata error handler for
    each port known to the host. This is somewhat suboptimal, since that
    port may not need recovering, but given the current architecture of
    the libata error handler, it's the only way; and the spurious
    activation is harmless.

    Signed-off-by: James Bottomley
    Signed-off-by: Jeff Garzik

    James Bottomley
     

08 Apr, 2008

1 commit

  • Two functions in include/scsi/sas_ata.h don't have static inlines
    leading to problems if they're built in:

    On Thu, 2008-04-03 at 14:06 +0200, Toralf Förster wrote:
    > drivers/scsi/mvsas.o: In function `sas_ata_init_host_and_port':
    > mvsas.c:(.text+0x0): multiple definition of `sas_ata_init_host_and_port'
    > drivers/scsi/libsas/built-in.o:(.text+0x37f4): first defined here
    > drivers/scsi/mvsas.o: In function `sas_ata_task_abort':
    > mvsas.c:(.text+0x7): multiple definition of `sas_ata_task_abort'
    > drivers/scsi/libsas/built-in.o:(.text+0x37fb): first defined here
    > make[2]: *** [drivers/scsi/built-in.o] Error 1
    > make[1]: *** [drivers/scsi] Error 2
    > make: *** [drivers] Error 2

    Add the correct static inline modifiers.

    Tested-by: Toralf Förster
    Signed-off-by: James Bottomley

    James Bottomley
     

23 Jul, 2007

1 commit


19 Jul, 2007

2 commits

  • ATA devices need special handling for sas_task_abort. If the ATA command
    came from SCSI, then we merely need to tell SCSI to abort the scsi_cmnd.
    However, internal commands require a bit more work--we need to fill the qc
    with the appropriate error status and complete the command, and eventually
    post_internal will issue the actual ABORT TASK.

    Signed-off-by: James Bottomley

    Darrick J. Wong
     
  • This is a respin of my earlier patch that migrates the ATA support code
    into a separate file. For now, the controversial linking bits have
    been removed per James Bottomley's request for a patch that contains
    only the migration diffs, which means that libsas continues to require
    libata. I intend to address that problem in a separate patch.

    This patch is against the aic94xx-sas-2.6 git tree, and it has been
    sanity tested on my x206m with Seagate SATA and SAS disks without
    uncovering any new problems.

    Signed-off-by: Darrick J. Wong
    Signed-off-by: James Bottomley

    Darrick J. Wong