31 Oct, 2011

11 commits

  • Fixes a bug in how the max_concurr_spinup OEM parameter is
    overridden by the max_concurr_spinup user parameter. The override
    should happen only when the max_concurr_spinup user parameter is
    specified on the command line (greater than 0). This fix also shortens
    the variables representing max_concurr_spinup for OEM and user parameters.

    Signed-off-by: Andrzej Jakowski
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Andrzej Jakowski
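
The override rule described above can be sketched as a small model. This is a Python illustration, not the driver's C code; the function name and the treatment of 0 as "not specified" are assumptions drawn from the commit text, and any clamping the driver applies is not modelled.

```python
def effective_max_concurr_spinup(oem_value: int, user_value: int = 0) -> int:
    """Return the spin-up limit to use.

    Assumption: a user value of 0 (or less) means the parameter was not
    given on the command line, so the OEM value is kept.
    """
    if user_value > 0:
        return user_value
    return oem_value


if __name__ == "__main__":
    print(effective_max_concurr_spinup(oem_value=4))                # OEM value kept
    print(effective_max_concurr_spinup(oem_value=4, user_value=2))  # user value wins
```

Keeping the "was it specified" check in one helper makes the rule easy to test in isolation.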
     
  • The initial bcn filtering implementation was validated on a kernel
    baseline that predated the switch to new libata error handling. Also,
    prior to that conversion we borrowed the mvsas MVS_DEV_EH approach to
    prevent the unwanted extra ap->ops->phy_reset(ap) that occurred in the
    ata_bus_probe() path.

    After the conversion to new libata eh, resets at discovery are more
    frequent and get filtered prematurely by IDEV_EH. The result is that
    our bcn filtering has been blocked from running at discovery, and it
    appears to stall discovery completion to the point of triggering hung
    task timeouts. So, revert the implementation for now. When it returns
    it will go into libsas proper.

    The domain rediscovery that takes place due to ->lldd_I_T_nexus_reset()
    events should now be properly waited for by the ata_port_wait_eh() call
    in ata_port_probe(). So the hard coded delay in the isci
    ->lldd_I_T_nexus_reset() and other libsas drivers should help debounce
    the libsas thread from seeing temporary device removals.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • A hard reset can time out before or after the last phy in the
    port goes away. If after, then notify the OS that the last
    phy has failed.

    The recovery for the failed hard reset has been removed.
    This recovery code was unnecessary in that the link would
    recover from the failure normally by a new link reset sequence
    or hotplug of the remote device.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • The lldd does not need to look at or manage the pending device
    reset bit in pending sas_tasks.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Use the existing IREQ_TMF flag as a request type indicator.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • libsas uses the LLDD abort task interface to handle I/O timeouts
    in the SATA/STP and SMP discovery paths, so this change will terminate
    STP/SMP requests. Also, if the device is gone, the lldd will prevent
    libsas from further escalations in the error handler.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • libsas will clean up pending sas_tasks after the error handler
    path functions are called; do not call task_done callbacks.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • In the case where "task" requests time out (note that this class of
    requests can also include SATA/STP soft reset FIS transmissions),
    handle the case where the task was being managed by some call to
    terminate the task request by completing both the tmf and the aborting
    process.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Make sure terminated requests and completed task tags are freed.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • In the case where an I/O fails to start in isci_request_execute,
    only allow retries if the device is not already gone.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • The LLDD needs to obtain a reference to the device through the request
    itself and not through the domain_device, because the
    domain_device.lldd_dev is set to NULL early in the lldd_dev_gone call.
    This relies on the fact that the isci_remote_device object is keeping a
    separate reference count of outstanding requests. TODO: unify the
    request count tracking with the isci_remote_device kref.

    The failure signature of this condition looks like the following
    log, where the important bits are the call to lldd_dev_gone followed
    by a crash in isci_terminate_request_core:

    [ 229.151541] isci 0000:0b:00.0: isci_remote_device_gone: domain_device = ffff8801492d4800, isci_device = ffff880143c657d0, isci_port = ffff880143c63658
    [ 229.166007] isci 0000:0b:00.0: isci_remote_device_stop: isci_device = ffff880143c657d0
    [ 229.175317] isci 0000:0b:00.0: isci_terminate_pending_requests: idev=ffff880143c657d0 request=ffff88014741f000; task=ffff8801470f46c0 old_state=2
    [ 229.189702] isci 0000:0b:00.0: isci_terminate_request_core: device = ffff880143c657d0; request = ffff88014741f000
    [ 229.201339] isci 0000:0b:00.0: isci_terminate_request_core: before completion wait (ffff88014741f000/ffff880149715ad0)
    [ 229.213414] isci 0000:0b:00.0: sci_controller_process_completions: completion queue entry:0x8000a0e9
    [ 229.214401] BUG: unable to handle kernel NULL pointer dereference at 0000000000000228
    [ 229.214401] IP:jdskirvi-testlbo [] sci_request_completed_state_enter+0x50/0xafb [isci]
    [ 229.214401] PGD 13d19e067 PUD 13d104067 PMD 0
    [ 229.214401] Oops: 0000 [#1] SMP
    [ 229.214401] CPU 0 x kernel: [ 226
    [ 229.214401] Modules linked in: ipv6 dm_multipath uinput nouveau snd_hda_codec_realtek snd_hda_intel ttm drm_kms_helper drm snd_hda_codec snd_hwdep snd_pcm snd_timer i2c_algo_bit isci snd libsas ioatdma mxm_wmi iTCO_wdt soundcore snd_page_alloc scsi_transport_sas iTCO_vendor_support wmi dca video i2c_i801 i2c_core [last unloaded: speedstep_lib]
    [ 229.214401]
    [ 229.214401] Pid: 5, comm: kworker/u:0 Not tainted 3.0.0-isci-11.7.29+ #30.353196] Buffer Intel Corporation Stoakley/Pearlcity Workstation
    [ 229.214401] RIP: 0010:[] I/O error on dev [] sci_request_completed_state_enter+0x50/0xafb [isci]
    [ 229.214401] RSP: 0018:ffff88014fc03d20 EFLAGS: 00010046
    [ 229.214401] RAX: 0000000000000000 RBX: ffff88014741f000 RCX: 0000000000000000
    [ 229.214401] RDX: ffffffffa00b2c90 RSI: 0000000000000017 RDI: ffff88014741f0a0
    [ 229.214401] RBP: ffff88014fc03d90 R08: 0000000000000018 R09: 0000000000000000
    [ 229.214401] R10: 0000000000000000 R11: ffffffff81a17d98 R12: 000000000000001d
    [ 229.214401] R13: ffff8801470f46c0 R14: 0000000000000000 R15: 0000000000008000
    [ 229.214401] FS: 0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
    [ 229.214401] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 229.214401] CR2: 0000000000000228 CR3: 000000013ceaa000 CR4: 00000000000406f0
    [ 229.214401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 229.214401] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 229.214401] Process kworker/u:0 (pid: 5, threadinfo ffff880149714000, task ffff880149718000)
    [ 229.214401] Call Trace:
    [ 229.214401]
    [ 229.214401] [] sci_change_state+0x4a/0x4f [isci]
    [ 229.214401] [] sci_io_request_tc_completion+0x79c/0x7a0 [isci]
    [ 229.214401] [] sci_controller_process_completions+0x14f/0x396 [isci]
    [ 229.214401] [] ? spin_lock_irq+0xe/0x10 [isci]
    [ 229.214401] [] isci_host_completion_routine+0x71/0x2be [isci]
    [ 229.214401] [] ? mark_held_locks+0x52/0x70
    [ 229.214401] [] tasklet_action+0x90/0xf1
    [ 229.214401] [] __do_softirq+0xe5/0x1bf
    [ 229.214401] [] ? hrtimer_interrupt+0x129/0x1bb
    [ 229.214401] [] call_softirq+0x1c/0x30
    [ 229.214401] [] do_softirq+0x4b/0xa3
    [ 229.214401] [] irq_exit+0x53/0xb4
    [ 229.214401] [] smp_apic_timer_interrupt+0x83/0x91
    [ 229.214401] [] apic_timer_interrupt+0x13/0x20
    [ 229.214401]
    [ 229.214401] [] ? retint_restore_args+0x13/0x13
    [ 229.214401] [] ? trace_hardirqs_off+0xd/0xf
    [ 229.214401] [] ? vprintk+0x40b/0x452
    [ 229.214401] [] printk+0x41/0x47
    [ 229.214401] [] __dev_printk+0x78/0x7a
    [ 229.214401] [] dev_printk+0x45/0x47
    [ 229.214401] [] isci_terminate_request_core+0x15d/0x317 [isci]
    [ 229.214401] [] isci_terminate_pending_requests+0x1a4/0x204 [isci]
    [ 229.214401] [] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
    [ 229.214401] [] isci_remote_device_nuke_requests+0xa6/0xff [isci]
    [ 229.214401] [] isci_remote_device_stop+0x7c/0x166 [isci]
    [ 229.214401] [] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
    [ 229.214401] [] isci_remote_device_gone+0x76/0x7e [isci]
    [ 229.214401] [] sas_notify_lldd_dev_gone+0x34/0x36 [libsas]
    [ 229.214401] [] sas_unregister_dev+0x57/0x9c [libsas]
    [ 229.214401] [] sas_unregister_domain_devices+0x36/0x65 [libsas]
    [ 229.214401] [] sas_deform_port+0x72/0x1ac [libsas]
    [ 229.214401] [] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
    [ 229.214401] [] sas_phye_loss_of_signal+0x3e/0x42 [libsas]

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     

29 Oct, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
    [SCSI] qla4xxx: export address/port of connection (fix udev disk names)
    [SCSI] ipr: Fix BUG on adapter dump timeout
    [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
    [SCSI] hpsa: change confusing message to be more clear
    [SCSI] iscsi class: fix vlan configuration
    [SCSI] qla4xxx: fix data alignment and use nl helpers
    [SCSI] iscsi class: fix link local mispelling
    [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
    [SCSI] aacraid: use lower snprintf() limit
    [SCSI] lpfc 8.3.27: Change driver version to 8.3.27
    [SCSI] lpfc 8.3.27: T10 additions for SLI4
    [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
    [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
    [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
    [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
    [SCSI] megaraid_sas: Changelog and version update
    [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
    [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
    [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
    [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
    ...

    Linus Torvalds
     

03 Oct, 2011

7 commits

  • Allow the sas-transport-class to update events for local phys via a new
    PHY_FUNC_GET_EVENTS command to ->lldd_control_phy(). Fix up drivers that
    are not prepared for new enum phy_func values, and unify
    ->lldd_control_phy() error codes.

    These are the SAS defined phy events that are reported in a
    smp-report-phy-error-log command:
    * /sys/class/sas_phy//invalid_dword_count
    * /sys/class/sas_phy//running_disparity_error_count
    * /sys/class/sas_phy//loss_of_dword_sync_count
    * /sys/class/sas_phy//phy_reset_problem_count

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
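
As a rough illustration of how the four counters listed above could be consumed from user space, the sketch below reads them from one phy directory. The helper name and the `phy_dir` argument are hypothetical; the caller is assumed to pass the directory of a single phy under /sys/class/sas_phy, and the files are assumed to hold decimal integers.

```python
from pathlib import Path

# Attribute names taken from the commit message above.
PHY_EVENT_COUNTERS = (
    "invalid_dword_count",
    "running_disparity_error_count",
    "loss_of_dword_sync_count",
    "phy_reset_problem_count",
)


def read_phy_events(phy_dir):
    """Read the four phy event counters from one phy directory.

    Missing, unreadable, or non-numeric files are reported as None
    rather than raising, so a partial result is still usable.
    """
    events = {}
    for name in PHY_EVENT_COUNTERS:
        try:
            events[name] = int((Path(phy_dir) / name).read_text().strip())
        except (OSError, ValueError):
            events[name] = None
    return events
```

Returning None for a missing attribute keeps the helper usable on kernels or drivers that do not expose every counter.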
     
  • Fixes a bug where any phy removed from the port set the port
    state to "stopping" - do this only when the last phy is removed
    from the port.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
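
The corrected behaviour can be modelled in a few lines. This is a toy Python model, not the driver's port state machine; the class, the "ready" state name, and the phy identifiers are assumptions, and only the "stopping" state comes from the commit text.

```python
class Port:
    """Toy port that tracks its member phys and a state string."""

    def __init__(self, phys):
        self.phys = set(phys)
        self.state = "ready"  # hypothetical name for the normal state

    def remove_phy(self, phy):
        """Drop a phy; enter "stopping" only once no phys remain."""
        self.phys.discard(phy)
        if not self.phys:
            self.state = "stopping"


if __name__ == "__main__":
    port = Port({0, 1})
    port.remove_phy(0)
    print(port.state)  # still "ready": one phy is left
    port.remove_phy(1)
    print(port.state)  # "stopping": the last phy is gone
```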
     
  • DONE_CRC_ERR is not an RNC suspension condition, so do not change the
    state to expect the incoming suspension notification.

    Signed-off-by: Jeff Skirvin
    [djbw: dropped DONE_CMD_LL_R_ERR change]
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Since libsas has its own means to escalate SATA/STP device error
    handling depending on task status codes, return all SATA/STP I/O
    on the normal path.

    i.e. skip sas_task_abort() and let sas_ata_task_done() disposition the
    qc. Longer term we want to audit non-essential calls to
    sas_task_abort().

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Based on original implementation from Jiangbi Liu and Maciej Trela.

    ATAPI transfers happen in two or three stages. The two-stage atapi
    commands are those that include a dma data transfer. The data transfer
    portion of these operations is handled by the hardware packet-dma
    acceleration. The three-stage commands do not have a data transfer and
    are handled without hardware assistance in raw frame mode.

    stage1: transmit host-to-device fis to notify the device of an incoming
    atapi cdb. Upon reception of the pio-setup-fis repost the task_context
    to perform the dma transfer of the cdb+data (go to stage 3), or repost
    the task_context to transmit the cdb as a raw frame (go to stage 2).

    stage2: wait for hardware notification of the cdb transmission and then
    go to stage 3.

    stage3: wait for the arrival of the terminating device-to-host fis and
    terminate the command.

    To keep the implementation simple we only support ATAPI packet-dma
    protocol (for commands with data) to avoid needing to handle the data
    transfer manually (like we do for SATA-PIO). This may affect
    compatibility for a small number of devices (see
    ATA_HORKAGE_ATAPI_MOD16_DMA).

    If the data transfer underruns or encounters an error, the
    device-to-host fis is expected to arrive in the unsolicited frame queue
    to pass to libata for disposition. However, in the DONE_UNEXP_FIS (data
    underrun) case it appears we need to craft a response. In the
    DONE_REG_ERR case we do receive the UF and propagate it to libsas.

    Signed-off-by: Maciej Trela
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
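
The stage sequencing described in this commit can be sketched as a tiny model. It is written in Python for brevity and is not the driver's state machine; the enum and function names are hypothetical, and only the ordering of stages follows the text above.

```python
from enum import Enum


class Stage(Enum):
    SEND_H2D_FIS = 1  # stage 1: host-to-device fis announces the cdb
    WAIT_CDB_TX = 2   # stage 2: wait for the raw-frame cdb transmission
    WAIT_D2H_FIS = 3  # stage 3: wait for the terminating device-to-host fis


def atapi_stages(has_dma_data: bool):
    """Return the stages a command passes through, in order.

    Commands with a dma data transfer skip stage 2 because the hardware
    packet-dma path sends cdb+data; commands without data send the cdb
    as a raw frame and therefore visit all three stages.
    """
    stages = [Stage.SEND_H2D_FIS]
    if not has_dma_data:
        stages.append(Stage.WAIT_CDB_TX)
    stages.append(Stage.WAIT_D2H_FIS)
    return stages


if __name__ == "__main__":
    print([s.name for s in atapi_stages(has_dma_data=True)])
    print([s.name for s in atapi_stages(has_dma_data=False)])
```

Error and underrun handling (the DONE_UNEXP_FIS and DONE_REG_ERR cases) is deliberately left out of this sketch.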
     
  • Needed to jump to scic_lock unlock.

    Also spotted by coccicheck.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Kill the local smp response buffer.

    Besides being unnecessary, it is too small (currently truncates
    responses to 60 bytes). The mid-layer will have already allocated a
    sufficiently sized buffer, so just kmap and copy into it directly.

    Reported-by: Derick Marks
    Tested-by: Derick Marks
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

22 Sep, 2011

2 commits

  • Basic support to initialize the gpio unit, accept an incoming
    SAS_GPIO_REG_TX_GP bitstream, and translate it to the ODx.n fields in
    the hardware registers. If register indexes outside the supported range
    are specified in the SMP frame we simply accept the write and return how
    many registers (SFF-8485) were written (libsas reports this as residue
    in the request).

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • output_data_select registers are off by one u32

    delete the macros we will never use.

    Reported-by: Artur Wojcik
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

15 Sep, 2011

2 commits


27 Aug, 2011

1 commit


24 Aug, 2011

8 commits

  • Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • Hardware only increments the put pointer on event types >= 4. Do not
    increment the get pointer for event type 3.

    Reported-by: Kapil Karkra
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • Hardware allows both a number of outstanding commands and a timeout value
    (whichever occurs first) as a gate to the next interrupt generation. This
    scheme at completion time looks at the remaining number of outstanding tasks
    and sets the timeout to maximize small transaction operation. If transactions
    are large (take more than a few 10s of microseconds to complete) then
    performance is not interrupt processing bound, so the small timeouts this
    scheme generates are overridden by the time it takes for a completion to
    arrive.

    Tested-by: Dave Jiang
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     
  • Instead of immediately completing any request that has a second
    termination call made on it, wait for the TC done/abort HW event.

    Signed-off-by: Jeff Skirvin
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Jeff Skirvin
     
  • Adds an API update for exposing the isci_id entry as a scsi_host sysfs
    entry. Also fixes up the sysfs registration in the scsi_host template.

    Signed-off-by: Dave Jiang
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dave Jiang
     
  • Need the following workaround in the driver for interoperability with
    the older Intel SSD drives and any other SATA drive that may exhibit the
    same behavior. This is a corner case where SCU speed is limited to
    either 3G or 1.5G and the drive has a period of DC idle when it switches
    speed during SATA speed negotiation. Workaround: change PHYTOV[31:24]
    from 0x36 to 0x3B.

    Signed-off-by: Marcin Tomczak
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Marcin Tomczak
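
The register change in the workaround is a plain bitfield update of bits 31:24. A minimal sketch of that arithmetic follows, in Python rather than driver C; the constant and function names are hypothetical, and the lower 24 bits used in the example are made-up filler values.

```python
PHYTOV_FIELD_SHIFT = 24
PHYTOV_FIELD_MASK = 0xFF << PHYTOV_FIELD_SHIFT  # bits 31:24


def set_phytov_31_24(reg: int, value: int = 0x3B) -> int:
    """Return reg with bits 31:24 replaced by value, other bits untouched."""
    if not 0 <= value <= 0xFF:
        raise ValueError("field value must fit in 8 bits")
    cleared = reg & ~PHYTOV_FIELD_MASK & 0xFFFFFFFF
    return cleared | (value << PHYTOV_FIELD_SHIFT)


if __name__ == "__main__":
    old = 0x36123456  # 0x36 in bits 31:24, filler in the low bits
    print(hex(set_phytov_31_24(old)))
```

Masking before OR-ing keeps the other timeout fields in the same register unchanged.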
     
  • The unsolicited frame control infrastructure requires a table of dma
    addresses for the hardware to look up the frame buffer location by an
    index. The hardware expects the elements of this table to be 64-bit
    quantities, so we cannot reference these elements as dma_addr_t. All
    unsolicited frame protocols are affected, particularly SATA-PIO and SMP,
    which prevented direct-attached SATA drives and expander-attached drives
    from being discovered.

    Reported-by: Jacek Danecki
    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
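
To see why the element width matters, the sketch below packs the same address list with 4-byte and 8-byte entries and then reads it back the way a consumer expecting 64-bit entries would. It is a Python model under stated assumptions: little-endian layout, made-up addresses, and hypothetical helper names. dma_addr_t can be 32 bits wide on some kernel configurations, which is the case the 4-byte table stands in for.

```python
import struct


def pack_table(addresses, entry_size):
    """Pack addresses as consecutive little-endian entries of 4 or 8 bytes."""
    code = {4: "I", 8: "Q"}[entry_size]
    return struct.pack("<%d%s" % (len(addresses), code), *addresses)


def hw_lookup(table, index):
    """Read entry `index` as a 64-bit value, as the hardware is said to."""
    return struct.unpack_from("<Q", table, index * 8)[0]


if __name__ == "__main__":
    addrs = [0x1000, 0x2000, 0x3000, 0x4000]  # made-up buffer addresses
    good = pack_table(addrs, 8)
    bad = pack_table(addrs, 4)
    print(hex(hw_lookup(good, 1)))  # the second address, as intended
    print(hex(hw_lookup(bad, 1)))   # two 32-bit entries fused into one value
```

With 4-byte entries every 64-bit read straddles two table slots, so the index no longer selects the intended frame buffer.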
     
  • A bug (likely copy/paste) that has been carried from the original
    implementation. The unsolicited frame handling structure returns the
    d2h fis in the isci_request.stp.rsp buffer.

    Signed-off-by: Dan Williams
    Signed-off-by: James Bottomley

    Dan Williams
     

04 Jul, 2011

1 commit


03 Jul, 2011

7 commits