26 Oct, 2010

40 commits

  • This patch adds the Scatter-Gather (sg) API to libosd.
    Scatter-gather enables a write/read of multiple non-contiguous
    areas of an object, in a single call. The extents may overlap
    and/or be in any order.

    The Scatter-Gather list is sent to the target in what is called
    a "cdb continuation segment". This is yet another possible segment
    in the osd-out-buffer. It is unlike all other segments in that it
    sits before the actual "data" segment (which until now was always
    first), and that it is signed by itself and not part of the data
    buffer. This is because the cdb-continuation-segment is considered
    a spill-over of the CDB data, and is therefore signed under
    OSD_SEC_CAPKEY and higher.

    TODO: A new osd_finalize_request_ex version should be supplied so
    the @caps received on the network also contains a size parameter
    and can be spilled over into the "cdb continuation segment".

    Thanks to John Chandy for the original code and investigations,
    and for the implementation of SG support in the osd-target.

    Original-coded-by: John Chandy
    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
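
To make the scatter-gather idea above concrete, here is a minimal userspace sketch of a gather-read over an in-memory object. It is not the libosd API: `struct sg_extent` and `sg_gather()` are hypothetical names, and the object is just a byte array. It only illustrates that the extents may be non-contiguous, overlapping, and in any order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical extent descriptor: a byte range inside one object. */
struct sg_extent {
    size_t offset;  /* start of the range within the object */
    size_t len;     /* number of bytes in the range */
};

/*
 * Gather-read: copy each extent of `obj` into `dst`, back to back, in
 * list order. Extents may overlap or be out of order. Returns the number
 * of bytes copied, or 0 if an extent falls outside the object or `dst`
 * is too small.
 */
static size_t sg_gather(const unsigned char *obj, size_t obj_len,
                        const struct sg_extent *ext, size_t n_ext,
                        unsigned char *dst, size_t dst_len)
{
    size_t copied = 0;

    assert(obj != NULL && ext != NULL && dst != NULL);

    for (size_t i = 0; i < n_ext; i++) {
        if (ext[i].offset > obj_len || ext[i].len > obj_len - ext[i].offset)
            return 0;   /* extent outside the object */
        if (ext[i].len > dst_len - copied)
            return 0;   /* destination buffer too small */
        memcpy(dst + copied, obj + ext[i].offset, ext[i].len);
        copied += ext[i].len;
    }
    return copied;
}
```

A scatter-write would be the mirror image: walk the same kind of list and copy consecutive chunks of one source buffer out to each extent.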
     
  • At osd_end_request, first free the request that might
    point to pages, then free those pages, in reverse order
    of allocation. For now this is just neatness. Once we
    use mempools it will also pay off in performance.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • The _osd_req_finalize_attr_page was off by a mile, when trying to
    append the enc_get_attr segment instead of the proper set_attr segment.

    Also properly support the case where there is no attribute to set
    while getting a full page, and the case of clearing an attribute by
    setting its size to zero.

    Reported-by: John Chandy
    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • Add new WQE fields as defined by the new SLI interface to support new hardware.

    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • Fix critical errors

    - Update send_scsi_event to validate pnode pointer active before copying
    the wwpn information.
    - Add a message, mailbox_idle, and unlock before failing SECURITY_MGMT
    or AUTH_PORT mailbox commands
    - Prevent spin_lock_irqsave from being called twice in a row.

    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • Adapter Shutdown and Unregistration cleanup

    - Correct the logic around hba shutdown. Prior to final reset, the
    driver must wait for all XRIs to return from the adapter. Added logic
    to poll, progressively slowing the poll rate as delay gets longer.
    - Correct behavior around the rsvd1 field in UNREG_RPI_ALL mailbox
    completion and final rpi cleanup.
    - Updated logic to move pending VPI registrations to their completion
    in cases where a CVL may be received while registration in progress.
    - Added unreg all rpi mailbox command before unreg vpi.

    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • Added driver logic to detect the last devloss timeout of the remote nodes
    that were still using an FCF. At that point, the driver sets the last
    in-use remote node devloss timeout flag if it was not already set, and
    takes the proper action on the in-use FCF and on recovering the FCF from
    firmware, depending on the state of the driver's FIP engine.

    An eligible FCF is found through an FCF table rescan or, when the rescan
    finds no eligible FCF, through the next new FCF event. A successful flogi
    into an FCF clears the HBA_DEVLOSS_TMO flag, indicating successful
    recovery from the devloss timeout.

    [jejb: add delay.h include to lpfc_hbadisc.c to fix ppc compile]
    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • Add support of received ELS commands

    - Add support for received RLS ELS command
    - Add support for received ECHO ELS command
    - Add support for received RTV ELS command

    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • FC/FCoE Discovery fixes:

    - Call the lpfc_drain_txq only for SLI4 hba
    - In lpfc_cmpl_els_fdisc, fix code path that does not free IOCB.
    - Treated firmware matching FCF property with different index as error
    - Propagate error returns from lpfc_issue_els_flogi()
    - Refactored lpfc_unregister_unused_fcf() to create a post
    lpfc_dev_loss_tmo handler call for SLI-4 devices. Allows checking of
    fcf after last ndlp released so that fcf can be released if no longer
    in use.
    - Replaced clearing of the individual FCF_XXXX_DISC flags with clearing of
    the aggregate FCF_DISCOVERY flag upon successful completion of flogi.
    - Correct setting of altBbCredit value in sparams to correct issue with
    logins with remote loop-based devices.

    Signed-off-by: Alex Iannicelli
    Signed-off-by: James Smart
    Signed-off-by: James Bottomley

    James Smart
     
  • There was an addition to the hardware roadmap that includes a new adapter.
    This patch adds the new definitions for the adapter.

    Signed-off-by: Wayne Boyer
    Acked-by: Brian King
    Signed-off-by: James Bottomley

    Wayne Boyer
     
  • This patch addresses the comments from Randy Dunlap (Randy.Dunlap@oracle.com)
    regarding comment blocks that begin with "/**". bfa driver comments
    currently do not follow the kernel-doc convention, so we replace all
    /** with /* and **/ with */.

    Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • This patch addresses the comments from Randy Dunlap (Randy.Dunlap@oracle.com)
    regarding comment blocks that begin with "/**". bfa driver comments
    currently do not follow the kernel-doc convention, so we replace all
    /** with /* and **/ with */.

    Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • Fix compile warning for frame size over 1024 in gcc 4.4.

    Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • This patch replaces register access functions and macros with the ones
    provided by Linux.

    Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • This patch removes the OS wrapper and unused functions.
    bfa_os_assign(), bfa_os_memset(), bfa_os_memcpy(), bfa_os_udelay(),
    bfa_os_vsprintf(), bfa_os_snprintf(), and bfa_os_get_clock() are replaced with
    direct assignment or native Linux functions. Some unused functions related to VF
    (Virtual Fabric) are also removed.

    Signed-off-by: Jing Huang
    Signed-off-by: James Bottomley

    Jing Huang
     
  • Signed-off-by: Vijay Chauhan
    Acked-by: Chandra Seetharaman
    Signed-off-by: James Bottomley

    Chauhan, Vijay
     
  • Ignore active open reply with status negative advice. This is an
    informational message.

    Signed-off-by: Karen Xie
    Reviewed-by: Mike Christie
    Signed-off-by: James Bottomley

    Karen Xie
     
  • Signed-off-by: Giridhar Malavali
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Giridhar Malavali
     
  • This patch fixes an issue which causes the firmware to fail with a
    'PRLI failed' status code (iop1 = 405). This status triggers the
    driver to fall into an incorrect code-path which does not attempt
    a login retry.

    Signed-off-by: Andrew Vasquez
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • This patch fixes a regression introduced by commit
    083a469db4ecf3b286a96b5b722c37fc1affe0be

    qla2xxx_eh_wait_on_command() is waiting for an srb to
    complete, which will never happen as the routine took
    a reference to the srb previously and will only drop it
    after this function. So every command abort will fail.

    Signed-off-by: Mike Christie
    Signed-off-by: Giridhar Malavali
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Mike Christie
     
  • This patch adds a shutdown handler to qla2xxx driver to make sure that all
    DMA and firmware activities are stopped, and any associated driver resources
    are released. The need for this handler arose when executing kexec in specific
    environments caused the data of the 2nd kernel to be corrupted, due to DMA
    activities.

    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Madhuranath Iyengar
     
  • Commit feafb7b1714cf599a6d0fed45801ab3f66046cbd neglected to initialize
    the spinlock.

    Signed-off-by: Andrew Vasquez
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • This patch cleans up any printk or debug tracing of the
    serial_number field in the qla2xxx driver.

    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Madhuranath Iyengar
     
  • Signed-off-by: Harish Zunjarrao
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Harish Zunjarrao
     
  • Currently, when we receive a CS_RESET as a response for a SCSI command, the
    driver returns DID_TRANSPORT_DISRUPTED to the SCSI mid-layer. Under
    certain circumstances this can cause the mid-layer to exhaust all of
    its retries if the FC port goes away for a short time, so that
    commands are failed prematurely. Moving the CS_RESET return code to be
    grouped with other link-level events causes the FC transport layer to block
    that target's queue, preventing the premature exhaustion of retries.

    Signed-off-by: Chad Dupuis
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Chad Dupuis
     
  • Using del_timer_sync() in the qla2x00_ctx_sp_free() function may cause a kernel
    panic as it is not interrupt context safe and qla2x00_ctx_sp_free() may be
    called from a softirq context. Changing the call from del_timer_sync() to
    del_timer() will make the function interrupt context safe.

    Signed-off-by: Chad Dupuis
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Chad Dupuis
     
  • Add the module parameter ql2xgffidenable to disable/enable the use of the
    GFF_ID name server command to prevent non FCP SCSI devices from being added to
    the driver's internal fc_port database.

    Signed-off-by: Chad Dupuis
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Chad Dupuis
     
  • This patch removes the use of the port down retry counter as a mechanism to
    update an fcport state. The internal driver counter is a residual carry-over
    from pre-FC-transport-aware driver interaction. The ql2xport_down_retry module
    parameter and the NVRAM-set ha->port_down_retry_count remain in order to seed the
    fc-host's default dev-loss-tmo.

    Signed-off-by: Chad Dupuis
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Chad Dupuis
     
  • IRQs are already disabled here so we don't need to disable them again.
    But more importantly, the spin_lock_irqsave() overwrites "flags" and
    that breaks things when we want to re-enable the IRQs when we call
    spin_unlock_irqrestore(&ha->hardware_lock, flags);

    Signed-off-by: Dan Carpenter
    Signed-off-by: Madhuranath Iyengar
    Signed-off-by: James Bottomley

    Dan Carpenter
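
The flags problem described above can be reproduced with a small userspace model. This is only a sketch: `irqs_enabled`, `model_lock_irqsave()` and `model_unlock_irqrestore()` are hypothetical stand-ins for the CPU interrupt state and for the kernel's spin_lock_irqsave()/spin_unlock_irqrestore(), which are macros in the real kernel and cannot run outside it.

```c
#include <stdbool.h>

/* Userspace stand-in for the CPU's "interrupts enabled" state. */
static bool irqs_enabled = true;

/* Model of spin_lock_irqsave(): remember the current state, then disable. */
static void model_lock_irqsave(unsigned long *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

/* Model of spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/* Buggy pattern: the second save overwrites `flags` with "disabled". */
static bool nested_save_same_flags(void)
{
    unsigned long flags;

    irqs_enabled = true;
    model_lock_irqsave(&flags);     /* flags = enabled */
    model_lock_irqsave(&flags);     /* flags = disabled, original state lost */
    model_unlock_irqrestore(flags); /* inner restore */
    model_unlock_irqrestore(flags); /* outer restore puts back "disabled" */
    return irqs_enabled;
}

/* Fixed pattern: IRQs are already off, so the state is saved only once. */
static bool single_save(void)
{
    unsigned long flags;

    irqs_enabled = true;
    model_lock_irqsave(&flags);
    /* inner section would take the lock without touching the IRQ state */
    model_unlock_irqrestore(flags);
    return irqs_enabled;
}
```

In this model `nested_save_same_flags()` leaves interrupts disabled after the final restore, while `single_save()` returns them to the enabled state.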
     
  • An sr device that reports sense data with SK/ASC/ASCQ of 2/4/2 (Not ready,
    Logical unit not ready, Initializing command required) will be handled
    in sr_drive_status as (2/4/!1) and assumed to be a 'format in progress'
    which returns CDS_DISC_OK. The drive will not be made ready in this case.

    Prior to 210ba1d1724f5c4ed87a2ab1a21ca861a915f734, sr_drive_status would
    have returned CDS_TRAY_OPEN, which results in a START_STOP_UNIT to
    close the tray, resolving the initialization requirement.

    This patch adds handling for SK/ASC/ASCQ of 2/4/2 where it will return
    CDS_TRAY_OPEN as a means of triggering a START_STOP_UNIT.

    This issue is seen on the IBM POWER platform when using a file-backed,
    virtual optical device. The device does not support media queries
    through the Get Event Status Notification command which could otherwise
    trigger a START_STOP_UNIT call to close an open tray.

    Signed-off-by: Robert Jennings
    Signed-off-by: James Bottomley

    Robert Jennings
     
  • A previous patch attempted to validate the destination
    MAC address of a FCoE frame by checking that MAC
    address against the received port's MAC address. The
    implementation seems fine on the surface, but any
    VN_Ports added using the NPIV feature will have their
    own MAC addresses and these MACs were not being checked,
    which prevented any NPIV VN_Ports from receiving frames.

    In other words, the following patch has broken NPIV.

    519e5135e2537c9dbc1cbcc0891b0a936ff5dcd2
    [SCSI] fcoe: adds src and dest mac address
    checking for fcoe frames

    Part of the offending patch is correct, but the part
    that broke NPIV was attempting to satisfy FC-BB-5
    section D.5, 2.1-

    (discard frames that) "contain a destination MAC
    address/destination N_Port_ID pair that was not
    assigned by an FCF to one of the VN_Ports on the ENode"

    The language does _not_ say to compare the destination
    FC-MAP/destination N_Port_ID, but instead to compare
    the destination MAC address/destination N_Port_ID.

    From the FC-BB-5 specification,

    "A properly formed FPMA is one in which the 24 most
    significant bits equal the Fabric’s FC-MAP value and
    the least significant 24 bits equal the N_Port_ID
    assigned to the VN_Port by the FCF."

    This means that we need to compare the FC Frame's
    destination FCID against the embedded FCID in the
    destination MAC address. This patch checks the lower
    24 bits of the destination MAC address against
    destination FCID in the Fibre Channel frame.

    For MAC validation the first line of defense is the
    hardware MAC filtering. Each VN_Port will have a
    unicast MAC address added to the hardware's
    filtering table. The Ethernet driver should drop any
    frames not destined for a programmed MAC. This patch
    adds a second line of defense that very specifically
    compares an element in the FC frame against an element
    in the Ethernet header, which is appropriate for the
    FCoE layer.

    Many alternative approaches were considered, including
    an LLD callback from libfc. The second most reasonable
    approach seemed to be walking the list of NPIV ports
    and checking each of their MAC addresses against the
    destination MAC address of the received frame. The
    problem with this approach is that performance would
    likely suffer as more NPIV ports are added to the
    system, since every received frame would need to
    walk this list, comparing each entry's MAC.

    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Robert Love
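
The FPMA check described in the commit above (lower 24 bits of the destination MAC against the frame's destination FCID) can be sketched as follows. This is an illustration, not the fcoe driver code: `mac_low24()` and `dest_mac_matches_did()` are hypothetical helper names, and the MAC is assumed to be a plain 6-byte array in network order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the 24-bit value held in the low three bytes of a 6-byte MAC. */
static uint32_t mac_low24(const uint8_t mac[6])
{
    return ((uint32_t)mac[3] << 16) | ((uint32_t)mac[4] << 8) | mac[5];
}

/*
 * True when the N_Port_ID embedded in the destination MAC equals the
 * destination FCID (D_ID) carried in the Fibre Channel frame header.
 * Only the low 24 bits of d_id are significant.
 */
static bool dest_mac_matches_did(const uint8_t dest_mac[6], uint32_t d_id)
{
    return mac_low24(dest_mac) == (d_id & 0xffffff);
}
```

Because the comparison uses only fields of the received frame itself, its cost does not grow with the number of NPIV ports, unlike the list-walking alternative discussed in the commit message.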
     
  • Fix: When a FIP frame is received, fcoe_ctlr_vn_recv calls
    fcoe_ctlr_vn_parse, which does a memset on addr (&buf.rdata), leading to
    memory corruption. The code was treating "buf" as a struct, but it was defined
    as a union. The fix is to change "buf" from a union to a struct in fcoe_ctlr_vn_recv.

    Technical Details: N/A

    Signed-off-by: Kiran Patil
    Acked-by: Joe Eykholt
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Kiran Patil
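
The union-versus-struct hazard behind the fix above can be shown with a reduced example. The types here are hypothetical (`part_a`, `part_b`) and far smaller than the real libfcoe structures; the sketch only demonstrates that clearing one member of a union also wipes the other, whereas in a struct the members have separate storage.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two pieces of data kept in "buf". */
struct part_a { uint32_t id; };
struct part_b { uint32_t flags; };

union buf_union {       /* members share the same storage */
    struct part_a a;
    struct part_b b;
};

struct buf_struct {     /* members have separate storage */
    struct part_a a;
    struct part_b b;
};

/* Clearing `a` inside a union also clears `b`. */
static uint32_t flags_after_clear_union(void)
{
    union buf_union buf;

    buf.b.flags = 0xabcd;
    memset(&buf.a, 0, sizeof(buf.a));
    return buf.b.flags;
}

/* Clearing `a` inside a struct leaves `b` untouched. */
static uint32_t flags_after_clear_struct(void)
{
    struct buf_struct buf;

    buf.b.flags = 0xabcd;
    memset(&buf.a, 0, sizeof(buf.a));
    return buf.b.flags;
}
```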
     
  • When the number of NPIV ports created is greater than the xids
    allocated per pool (e.g., creating 255 NPIV ports on a
    system with nr_cpu_ids of 32, with each pool containing 128
    xids), generating a link event (e.g.,
    shutdown/no shutdown) on the switch port causes a hang
    with the following stack trace.

    Call Trace:
    schedule_timeout+0x19d/0x230
    wait_for_common+0xc0/0x170
    __cancel_work_timer+0xcf/0x1b0
    fc_disc_stop+0x16/0x30 [libfc]
    fc_lport_reset_locked+0x47/0x90 [libfc]
    fc_lport_enter_reset+0x67/0xe0 [libfc]
    fc_lport_disc_callback+0xbc/0xe0 [libfc]
    fc_disc_done+0xa8/0xf0 [libfc]
    fc_disc_timeout+0x29/0x40 [libfc]
    run_workqueue+0xb8/0x140
    worker_thread+0x96/0x110
    kthread+0x96/0xa0
    child_rip+0xa/0x20

    Fix is to not cancel the disc_work if discovery is already
    stopped, thus allowing lport state machine to restart and try
    discovery again.

    Signed-off-by: Bhanu Prakash Gollapudi
    Acked-by: Joe Eykholt
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Bhanu Prakash Gollapudi
     
  • This is unlikely, but if it is hit it causes a panic
    due to a NULL cmd ptr. So far only one instance has been seen, recently
    with ESX, though the bug was introduced long ago by this commit:

    commit c1ecb90a66c5afc7cc5c9349f9c3714eef4a5cfb
    Author: Chris Leech
    Date: Thu Dec 10 09:59:26 2009 -0800
    [SCSI] libfc: reduce hold time on SCSI host lock

    Currently fsp->cmd is set to NULL without holding scsi_queue_lock,
    before the packet is dequeued from scsi_pkt_queue. As a result,
    fc_fcp_cleanup_each_cmd can see a NULL fsp->cmd when a command
    completes after fc_fcp_cleanup_each_cmd has taken its reference.
    There is no need to set fsp->cmd to NULL, since this is also
    protected by fc_fcp_lock_pkt(): in the race above,
    fc_fcp_lock_pkt() in fc_fcp_cleanup_each_cmd() will fail
    because that cmd is already done.

    Mike mentioned the same issue at
    http://www.open-fcoe.org/pipermail/devel/2010-September/010533.html

    Similarly, moved sc_cmd->SCp.ptr = NULL under scsi_queue_lock so
    that the scsi abort error handler won't abort completed cmds.

    Signed-off-by: Vasu Dev
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Vasu Dev
     
  • The current FIP_MODE_AUTO mode sometimes falls back to non-FIP
    mode while the DCB link is still getting ready in fabric mode with
    its peer switch. It falls back after a few libfc flogi retries,
    which is not what we want when working with FIP-enabled
    switches in FABRIC mode. Therefore set the default to FIP_MODE_FABRIC,
    as discussed and agreed before in this mail thread:
    http://www.open-fcoe.org/pipermail/devel/2010-August/010511.html

    Signed-off-by: Vasu Dev
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Vasu Dev
     
  • Sometimes a switch in NPV mode rejects the flogi request with DID
    zero. In that case flogi is not tried again and the port
    remains offline. This patch accepts the response only when it is
    an ACC with a non-zero DID, so that an RJT with DID=0 leads to a
    flogi retry and the FLOGI can succeed on the next try.

    Signed-off-by: Vasu Dev
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Vasu Dev
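
One way to read the FLOGI validation described above is as a single predicate on the reply type and the DID. The sketch below is an assumption about the shape of that check, not the libfc code: `enum els_reply`, `flogi_reply_accepted()` and `flogi_should_retry()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two ELS reply types named in the text. */
enum els_reply { ELS_REPLY_RJT, ELS_REPLY_ACC };

/* A FLOGI reply counts as a login only if it is an ACC with a non-zero DID. */
static bool flogi_reply_accepted(enum els_reply reply, uint32_t did)
{
    return reply == ELS_REPLY_ACC && did != 0;
}

/* Anything else, including an RJT carrying DID=0, leads to a retry. */
static bool flogi_should_retry(enum els_reply reply, uint32_t did)
{
    return !flogi_reply_accepted(reply, did);
}
```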
     
  • This is per Mike Christie's feedback: in this case IO
    could get retried for tape devices, and therefore DID_REQUEUE
    cannot be used. More details are in this thread.

    http://marc.info/?l=linux-scsi&m=127970522630136&w=2

    Signed-off-by: Vasu Dev
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Vasu Dev
     
  • There does not seem to be a reason why libfc adds a 5
    second delay to the user requested value for the dev loss
    tmo. There also does not seem to be a reason to allow
    setting it to 0 (or really close).

    This patch removes the extra 5 sec delay, and for 0 it
    sets it to 1 like other fc drivers. We should actually
    be able to set it to 0 since the queue_delayed_work API
    will just call queue_work, but other drivers set it to 1 in
    that case.

    Signed-off-by: Mike Christie
    Signed-off-by: Robert Love
    Signed-off-by: James Bottomley

    Mike Christie
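
The dev loss tmo change above amounts to a small mapping from the user-requested value to the value actually used. The sketch below states it under that reading; `dev_loss_tmo_old()` and `dev_loss_tmo_new()` are hypothetical names, and values are in seconds.

```c
/* Behaviour before the patch, per the text: pad the request by 5 seconds. */
static unsigned int dev_loss_tmo_old(unsigned int requested_secs)
{
    return requested_secs + 5;
}

/*
 * Behaviour after the patch, per the text: use the requested value as-is,
 * except that 0 is raised to 1 like other fc drivers do.
 */
static unsigned int dev_loss_tmo_new(unsigned int requested_secs)
{
    return requested_secs ? requested_secs : 1;
}
```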