20 May, 2008

28 commits

  • Signed-off-by: Jeff Garzik

    Jeff Garzik
     
  • The libata-acpi.c code currently accepts hotplug messages from both the
    port and the device. This does not match the behaviour of the bay
    driver, and may result in confusion when two hotplug requests are
    received for the same device. This patch limits the hotplug notification
    to removable ACPI devices, which in turn allows it to use the _STA
    method to determine whether the device has been removed or inserted.
    On removal, devices are marked as detached. On insertion, a hotplug scan
    is started. This should avoid lockups caused by the ata layer attempting
    to scan devices which have been removed. The uevent sending is moved
    outside the spinlock in order to avoid a warning generated by it firing
    when interrupts are disabled.

    Signed-off-by: Matthew Garrett
    Signed-off-by: Jeff Garzik

    Matthew Garrett
     
  • I was hoping ATA_HORKAGE_NODMA | ATA_HORKAGE_SKIP_PM could keep it
    happy, but even this doesn't work under certain configurations, and
    it's not like we can do anything useful with the config device anyway.
    Replace ATA_HORKAGE_SKIP_PM with ATA_HORKAGE_DISABLE and use it for
    the config device. This makes the device completely ignored by
    libata.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • When 4140 PMP is attached to sil24, NCQ commands to fan out port 1 and
    2 (0 based) often stall if commands are in progress to other ports.
    I've tried a number of things but can't tell what's going on. It
    never happens w/ ahci and reportedly sata_mv which can issue NCQ
    commands to multiple devices simultaneously like sil24 does.

    Disable NCQ for devices behind 4140 PMP for the time being.

    Signed-off-by: Tejun Heo
    Cc: Mark Lord
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • There's no reason to schedule LPM action after probing is complete
    causing another EH iteration. Just schedule it together with probing
    itself.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • PMP notification during reset can make some controllers fail reset
    processing and needs to be turned off during resets. PMP attach and
    full-revalidation path did this via sata_pmp_configure() but the quick
    revalidation path didn't. Move the notification disable code right above
    fan-out port recovery so that it's always turned off.

    This fixes obscure reset failures.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • This timeout was set low because previously PMP register access was
    done via polling and register access timeouts could stack up. This is
    no longer the case. One timeout will make all following accesses fail
    immediately.

    In rare cases both marvell and SIMG PMPs need almost a second. Bump
    it to 3s.

    While at it, rename it to SATA_PMP_RW_TIMEOUT. It's not specific to
    SCR access.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • No reason to get overzealous about recovered comm and data errors.
    Some PHYs habitually set them for no good reason and being draconian
    about these soft error conditions doesn't seem to help anybody.

    If the need ever arises, we might need to add a soft PHY error condition, say
    AC_ERR_MAYBE_ATA_BUS and use it only to determine whether speed down
    is necessary but I don't think that's very likely to happen. It's far
    more likely we'll get timeouts or fatal transmission errors if
    recovered errors are so prominent that they hamper operation.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Originally, whole reset processing was done while the port is frozen
    and SError was cleared during @postreset(). This had two race
    conditions. 1: hotplug could occur after reset but before SError is
    cleared and libata won't know about it. 2: hotplug could occur after
    all the reset is complete but before the port is thawed. As all
    events are cleared on thaw, the hotplug event would be lost.

    Commit ac371987a81c61c2efbd6931245cdcaf43baad89 kills the first race
    by clearing SError during link resume but before link onlineness test.
    However, this doesn't fix race #2 and in some cases clearing SError
    after SRST is a good idea.

    This patch solves this problem by cross checking link onlineness with
    classification result after SError is cleared and port is thawed.
    Reset is retried if link is online but all devices attached to the
    link are unknown. As all devices will be revalidated, this one-way
    check is enough to ensure that all devices are detected and
    revalidated reliably.

    This, luckily, also fixes the cases where host controller returns
    bogus status while harddrive is spinning up after hotplug making
    classification run before the device sends the first FIS and thus
    causes misdetection.

    Low level drivers can bypass the logic by setting class explicitly to
    ATA_DEV_NONE if ever necessary (currently none requires this).

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
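
The cross check described above can be sketched in user-space C. This is only an illustration under stated assumptions: the enum, the function name reset_needs_retry() and its signature are hypothetical and are not taken from libata.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device classes; the real libata enum has more members. */
enum dev_class { DEV_UNKNOWN, DEV_NONE, DEV_ATA, DEV_ATAPI };

/*
 * Return true if reset should be retried: the link reports online,
 * yet classification left every device on it as unknown.
 * A class explicitly set to DEV_NONE is not "unknown", so a driver
 * can use it to bypass the retry.
 */
static bool reset_needs_retry(bool link_online,
                              const enum dev_class *classes, size_t ndev)
{
    if (!link_online)
        return false;
    for (size_t i = 0; i < ndev; i++)
        if (classes[i] != DEV_UNKNOWN)
            return false;
    return true;
}
```

The check is one-way on purpose: an offline link with a classified device is not treated as an error here, since all devices are revalidated afterwards anyway.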
     
  • Previously reset freeze/thaw handling lived outside of ata_eh_reset()
    mainly because the original PMP reset code needed the port frozen
    while resetting all the fan-out ports, which is no longer the case.

    This patch moves freeze/thaw handling into ata_eh_reset().
    @prereset() and @postreset() are now called w/o freezing the port
    although @prereset() can be called frozen if the port is frozen prior
    to entering ata_eh_reset().

    This makes code simpler and will help removing hotplug event related
    races.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Reorganize ata_eh_reset() such that @prereset() is called even when no
    reset method is available and if block is used instead of goto to skip
    actual reset. This makes no reset case behave better (readiness wait)
    and future changes easier.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • The @online out parameter is supposed to be set to true iff link is
    online and reset succeeded as advertised in the function description
    and callers are coded expecting that. However, sata_link_reset()
    didn't behave this way on device readiness test failure. Fix it.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Minor coding-style fixes for sata_promise:
    - remove stray blank lines
    - fix checkpatch.pl errors; warnings about long lines
    remain, but I don't intend to address those at this time
    - remove two inline directives: neither is essential and
    both functions are trivially inlinable anyway by virtue
    of being static and having a single unique call site
    - fix comment in pdc_interrupt(): the bits in PDC_INT_SEQMASK
    denote SEQIDs not tags, the distinction becomes important
    when NCQ gets implemented

    Signed-off-by: Mikael Pettersson
    Signed-off-by: Jeff Garzik

    Mikael Pettersson
     
  • This patch cleans up sata_promise's mmio accesses.

    In sata_promise there are three distinct mmio address spaces:
    1. global registers, offsets from host->iomap[PDC_MMIO_BAR]
    2. per-port ATA registers, offsets from ap->ioaddr.cmd_addr
    3. per-port SATA registers, offsets from ap->ioaddr.scr_addr

    The driver currently often fails to indicate which address space
    a given mmio base pointer refers to, which is a source of bugs
    and confusion (see recent pdc_thaw() irq clearing bug; it's also
    been an obstacle for the pending NCQ extensions).

    To reduce these problems, adopt a coding style where the name of
    a base pointer always indicates which address space it refers to:
    1. global registers: host_mmio
    2. per-port ATA registers: ata_mmio
    3. per-port SATA registers: sata_mmio

    Also rearrange register offset definitions to clearly indicate
    which address space they belong to, and add a symbolic definition
    for the previously hard-coded PHYMODE4 register.

    Signed-off-by: Mikael Pettersson
    Signed-off-by: Jeff Garzik

    Mikael Pettersson
     
  • This patch fixes two bugs in sata_promise's irq status clearing paths:
    1. When clearing the irq status for a specific port, the driver
    read the global SEQMASK register. This is wrong because that
    clears the irq status for _all_ ports.
    2. pdc_thaw() incorrectly added the PDC_INT_SEQMASK host register
    offset to a per-port ata engine base address. This resulted in
    it reading the unrelated PDC_PKT_SUBMIT register, which did not
    have the desired irq status clearing effect.

    In both cases the fix is to read from the port's Command/Status
    register. This also matches what Promise's own driver does.

    Signed-off-by: Mikael Pettersson
    Signed-off-by: Jeff Garzik

    Mikael Pettersson
     
  • Use the kernel-provided clamp_val() macro.

    FIT was always applied to a member of struct ata_timing (unsigned short)
    and two constants. clamp_val will not cast to short anymore.

    Signed-off-by: Harvey Harrison
    Cc: Jeff Garzik
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Jeff Garzik

    Harvey Harrison
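
For readers unfamiliar with the idiom, here is a user-space approximation of a clamp macro that takes its type from the value being clamped. It is a sketch, not the kernel's definition: clamp_val_sketch() and clamp_timing() are hypothetical names, the bounds 1 and 15 are made up for illustration, and the macro relies on the GNU C statement-expression and __typeof__ extensions (GCC or Clang).

```c
/*
 * User-space stand-in for a clamp macro: both bounds are converted to
 * the type of the value, then the value is limited to [lo, hi].
 * Uses GNU C extensions (statement expression, __typeof__).
 */
#define clamp_val_sketch(val, lo, hi) ({            \
    __typeof__(val) v_ = (val);                     \
    __typeof__(val) lo_ = (lo);                     \
    __typeof__(val) hi_ = (hi);                     \
    v_ < lo_ ? lo_ : (v_ > hi_ ? hi_ : v_); })

/* Hypothetical use: limit a timing field to an illustrative 1..15 range. */
static unsigned short clamp_timing(unsigned short cycles)
{
    return clamp_val_sketch(cycles, 1, 15);
}
```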
     
  • Check for an empty request queue before stopping EDMA after a FBS-NCQ error,
    as per recommendation from the Marvell datasheet.

    This ensures that the EDMA won't suddenly become active again
    just after our subsequent check of the empty/idle bits.

    Also bump DRV_VERSION.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Part five of simplifying/fixing handling of the main_irq_mask register
    to resolve unexpected interrupt issues observed in 2.6.26-rc*.

    Keep a cached copy of the main_irq_mask so that we don't have
    to stall the CPU to read it on every pass through mv_interrupt.

    This significantly speeds up interrupt handling, both for sata_mv,
    and for any other driver/device sharing the same PCI IRQ line.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
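
The caching idea can be illustrated with a simulated register. This is a sketch under assumptions: struct host_sketch, its fields and set_main_irq_mask() are hypothetical stand-ins, and a plain variable replaces the memory-mapped register so the code runs in user space.

```c
#include <stdint.h>

/* Hypothetical host state; reg stands in for the memory-mapped register. */
struct host_sketch {
    uint32_t main_irq_mask;      /* cached copy, read instead of the hardware */
    volatile uint32_t reg;       /* simulated hardware register */
    unsigned int reg_writes;     /* counts simulated register writes */
};

/* Single update path: clear disable_bits, set enable_bits, write on change. */
static void set_main_irq_mask(struct host_sketch *h,
                              uint32_t disable_bits, uint32_t enable_bits)
{
    uint32_t old_mask = h->main_irq_mask;
    uint32_t new_mask = (old_mask & ~disable_bits) | enable_bits;

    if (new_mask != old_mask) {
        h->main_irq_mask = new_mask;
        h->reg = new_mask;
        h->reg_writes++;
    }
}
```

With every update funnelled through one function, the cached copy cannot drift from the register, and the interrupt path can consult the cache without a slow register read.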
     
  • Part four of simplifying/fixing handling of the main_irq_mask register
    to resolve unexpected interrupt issues observed in 2.6.26-rc*.

    Ignore masked IRQs in mv_interrupt().
    This prevents "unexpected device interrupt while idle" messages.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
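
Ignoring masked IRQs amounts to filtering the cause bits through the mask before acting on them. A minimal sketch, assuming a hypothetical helper named pending_irqs() and 32-bit cause and mask values:

```c
#include <stdint.h>

/* Return only the cause bits whose interrupts are currently enabled. */
static uint32_t pending_irqs(uint32_t main_irq_cause, uint32_t main_irq_mask)
{
    return main_irq_cause & main_irq_mask;
}
```

A handler built this way would return early when the result is zero, so a cause bit raised by a masked source is never treated as a device interrupt.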
     
  • Part three of simplifying/fixing handling of the main_irq_mask register
    to resolve unexpected interrupt issues observed in 2.6.26-rc*.

    Partially fix a reported bug whereby we sometimes miss seeing drives on
    a port-multiplier, as reported by Gwendal Grignou.

    The problem was that we were receiving unexpected interrupts
    during EH from POLLed commands while accessing port-multiplier registers.
    These unexpected interrupts can be prevented by masking the DONE_IRQ bit
    for the port whenever not operating in EDMA mode.

    Also fix port_stop() to mask all port interrupts.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Part two of simplifying/fixing handling of the main_irq_mask register
    to resolve unexpected interrupt issues observed in 2.6.26-rc*.

    Consolidate all updates of the host main_irq_mask register
    into a single function. This simplifies maintenance,
    and also prepares the way for caching it (later).

    No functionality changes in this update.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Part one of simplifying/fixing handling of the main_irq_mask register
    to resolve unexpected interrupt issues observed in 2.6.26-rc*.

    Don't blindly enable port IRQs at host init time.
    Instead, enable only the bits that we want,
    which in this case is simply the PCI_ERR bit.

    The per-port bits can wait until the ports are reset/probed for devices.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Now that we handle the FIS_IRQ_CAUSE register correctly,
    we can also now handle SATA asynchronous notification events.

    So enable them, but only for the more modern GenIIe chips.
    (older chips have unaddressed errata issues related to this).

    This fixes hot plug/unplug for port-multiplier ports.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Group all of the flags for GenIIe devices into a common definition,
    to ensure that any updates to them are shared by all GenIIe devices.

    This will help make future maintenance somewhat simpler.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Fix handling of the FIS_IRQ_CAUSE register in sata_mv.

    This register exists *only* on GenIIe devices, so don't bother
    writing to it on older chips. Also, it has to be read/cleared
    in mv_err_intr() before clearing the main ERR_IRQ_CAUSE register.

    This keeps sata_mv from getting stuck forever on certain error types.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Always request a softreset after hardreset succeeds.

    This fixes a regression reported by Martin Michlmayr.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
     
  • Remove an explicit memset(.., 0, ...) to a variable allocated with
    kzalloc (i.e. 'info').

    Signed-off-by: Christophe Jaillet
    Acked-by: Haavard Skinnemoen
    Signed-off-by: Jeff Garzik

    Christophe Jaillet
     
  • Set ATAPI host state machine to control IDE device terminate sequence.
    Some IDE hard disks may assert terminate sequence in the middle of a
    formal DMA transaction and resume later. Bit DETECT_TERM in ATAPI_CTRL
    register determines whether the ATAPI host state machine or the kernel
    driver should take care of this case.

    Signed-off-by: Sonic Zhang
    Signed-off-by: Bryan Wu
    Signed-off-by: Jeff Garzik

    Sonic Zhang
     

06 May, 2008

12 commits

  • A couple of distributions (Fedora, Ubuntu) were having weird problems with the
    ATI IXP series PATA controllers being reported as simplex. At the heart of
    the problem is that both distros ignored the recommendations to load pata_acpi
    and ata_generic *AFTER* specific host drivers.

    The underlying cause however is that if you D3 and then D0 an ATI IXP it
    helpfully throws away some configuration and won't let you rewrite it.

    Add checks to ata_generic and pata_acpi to pin ATIIXP devices. Possibly the
    real answer here is to quirk them and pin them, but right now we can't do that
    before they've been pcim_enable()'d by a driver.

    I'm indebted to David Gero for this. His bug report not only reported the
    problem but identified the cause correctly and he had tested the right values
    to prove what was going on.

    [If you backport this for 2.6.24 you will need to pull in the 2.6.25
    removal of the bogus WARN_ON() in pcim_enable]

    Signed-off-by: Alan Cox
    Tested-by: David Gero
    Signed-off-by: Andrew Morton
    Signed-off-by: Jeff Garzik

    Alan Cox
     
  • sata_inic162x is now ready for production use. Bump the version,
    explain what's working and what's not and drop EXPERIMENTAL.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • When attached to cardbus, mmio region is at BAR 1. Other than that,
    everything else is the same. Add support for it.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • sata_inic162x now doesn't use any SFF features. Remove all SFF
    related stuff.

    * Mask unsolicited ATA interrupts. This removes our primary source of
    spurious interrupts and spurious interrupt handling can be tightened
    up. There's no need to clear ATA interrupts by reading status
    register either.

    * Don't dance with IDMA_CTL_ATA_NIEN and simplify accesses to
    IDMA_CTL.

    * Inherit from sata_port_ops instead of ata_sff_port_ops.

    * Don't initialize or use ioaddr. There's no need to map BAR0-4
    anymore.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Use IDMA for ATAPI commands. Write and some misc commands time out
    when executed using ATAPI_PROT_DMA but ATAPI_PROT_PIO works fine. As
    PIO is driven by DMA too, it doesn't make any noticeable difference
    for native SATA devices. inic_check_atapi_dma() is implemented to
    force PIO for those ATAPI commands.

    After this change, sata_inic162x issues all commands using IDMA.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Use IDMA for PIO and non-data commands. This allows sata_inic162x to
    safely drive LBA48 devices. Kill inic_dev_config() which contains
    code to reject LBA48 devices.

    With this change, status checking in inic_qc_issue() to avoid hard
    lock up after hotplug can go away too.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • sata_inic162x doesn't use BMDMA anymore. Kill bmdma related stuff.

    * prdctl manipulation

    * port IRQ mask manipulation

    * inherit ATA_BASE_SHT instead of ATA_BMDMA_SHT

    * BMDMA methods

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • The modified driver on the initio site gives enough clues on how to use IDMA.
    Use IDMA for ATA_PROT_DMA.

    * LBA48 now works as long as it uses DMA (LBA48 devices still aren't
    allowed as it can destroy data if PIO is used for any reason).

    * No need to mask IRQs for read DMAs as IDMA_DONE is properly raised
    after transfer to memory is actually completed. There will be some
    spurious interrupts but host_intr will handle it correctly and
    manipulating port IRQ mask interacts badly with the other port for
    some reason, so command type dependent port IRQ masking is not used
    anymore.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • inic162x can't reliably read back TF or at least we don't know how to
    do it yet. The only values which seem reliable are status and error.
    This patch updates access to TF.

    * implement inic_tf_read() which reads the TF area in mmio area

    * implement custom inic_qc_fill_rtf() which only returns true if
    status indicates device error. It'll be returning bogus addresses
    for device errors but it'll be able to report why it failed at
    least.

    * implement custom inic_check_ready() and use ata_wait_after_reset()
    instead of the SFF version.

    * use inic_tf_read() for classification.

    This is not perfect but it fixes hotplug detection failure and at
    least makes the driver report 0's instead of random garbage while
    reporting valid status and error for device errors.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
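
The "only report a result taskfile on device error" rule can be sketched as follows. The struct, the function name fill_rtf_sketch() and the choice to test only the ERR bit are assumptions made for illustration; the one fixed fact used is that ERR is bit 0 of the ATA status register.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATA_ERR_BIT 0x01  /* ERR bit (bit 0) of the ATA status register */

/* Hypothetical reduced taskfile: only the two fields assumed reliable. */
struct tf_sketch {
    uint8_t status;
    uint8_t error;
};

/*
 * Copy status and error to the result only when status flags a device
 * error; otherwise report that no result taskfile is available.
 */
static bool fill_rtf_sketch(const struct tf_sketch *hw, struct tf_sketch *out)
{
    if (!(hw->status & ATA_ERR_BIT))
        return false;
    out->status = hw->status;
    out->error = hw->error;
    return true;
}
```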
     
  • * add a bunch of constants, most are from the datasheet, a few
    undocumented ones are from initio's modified driver

    * HCTL_PWRDWN is bit 12 not 13

    This is in preparation of further inic162x updates.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • * use larger indents for structure member definitions

    * kill unused variable @addr in inic_scr_write()

    * kill unnecessary flushes in inic_freeze/thaw()

    * kill buggy explicit kfree() on devres managed port private data

    This is in preparation of further inic162x updates.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • Some tidying as suggested by Grant Grundler.

    Nuke local bit-counting function from sata_mv in favour of using hweight16().
    Also add a short explanation for the 15msec timeout used when waiting for empty/idle.

    Signed-off-by: Mark Lord
    Signed-off-by: Jeff Garzik

    Mark Lord
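
For reference, a 16-bit population count of the kind hweight16() provides can be written in a few lines of portable C. This is a user-space sketch with a hypothetical name, hweight16_sketch(), and does not reflect how the kernel implements it.

```c
#include <stdint.h>

/* Count set bits in a 16-bit word by clearing the lowest set bit each pass. */
static unsigned int hweight16_sketch(uint16_t w)
{
    unsigned int n = 0;

    while (w) {
        w &= (uint16_t)(w - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```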