29 Sep, 2008

1 commit


24 Sep, 2008

1 commit

  • Sometimes, particularly for USB devices with the last sector bug,
    requests get completed in chunks. The bug is that if one of the
    chunks gets an error, we complete that chunk with an error but never
    move on to the remaining ones, so the request hangs (because it is
    never fully completed).

    Fix this by completing all remaining chunks if an error is encountered.

    Cc: Alan Stern
    Signed-off-by: James Bottomley

    James Bottomley
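
    The fix described above can be sketched in user-space C. This is a
    minimal model, not the kernel code: struct chunked_request and
    complete_chunk() are hypothetical names introduced only for
    illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a request that is completed in several chunks. */
struct chunked_request {
    size_t total_bytes;     /* size of the whole request */
    size_t completed_bytes; /* bytes completed so far */
    int    error;           /* first error seen, 0 if none */
};

/*
 * Complete one chunk. Returns true once the whole request is completed.
 * If the chunk failed, everything still outstanding is completed as
 * well, so the request cannot be left half-finished and hanging.
 */
static bool complete_chunk(struct chunked_request *rq, size_t bytes, int error)
{
    assert(rq != NULL);

    size_t left = rq->total_bytes - rq->completed_bytes;

    if (error) {
        if (!rq->error)
            rq->error = error;
        bytes = left;        /* the fix: finish all remaining chunks */
    } else if (bytes > left) {
        bytes = left;        /* never complete more than is outstanding */
    }

    rq->completed_bytes += bytes;
    return rq->completed_bytes == rq->total_bytes;
}
```

    With a 4096-byte request, one good 1024-byte chunk followed by a
    failing chunk leaves the request fully completed with the error
    recorded, rather than stuck at 2048 bytes.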
     

14 Sep, 2008

1 commit

  • Josip Rodin noted
    (http://article.gmane.org/gmane.linux.ports.sparc/10152) the
    driver oopsing during registration of an rport to the
    FC-transport layer with a backtrace indicating a dereferencing of
    an shost->shost_data equal to NULL. David Miller identified a
    small window in driver logic where this could happen:

    > Look at how the driver registers the IRQ handler before the host has
    > been registered with the SCSI layer.
    >
    > That leads to a window of time where the shost hasn't been setup
    > fully, yet ISRs can come in and trigger DPC thread events, such as
    > loop resyncs, which expect the transport area to be setup.
    >
    > But it won't be setup, because scsi_add_host() hasn't finished yet.
    >
    > Note that in Josip's crash log, we don't even see the
    >
    > qla_printk(KERN_INFO, ha, "\n"
    > " QLogic Fibre Channel HBA Driver: %s\n"
    > " QLogic %s - %s\n"
    > " ISP%04X: %s @ %s hdma%c, host#=%ld, fw=%s\n",
    > ...
    >
    > message yet.
    >
    > Which means that the crash occurs between qla2x00_request_irqs()
    > and printing that message.

    Close this window by enabling RISC interrupts after the host has
    been registered with the SCSI midlayer.

    Reported-by: Josip Rodin
    Cc: Stable Tree
    Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
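
    The ordering problem can be modelled in a few lines of user-space
    C. This is only a sketch under stated assumptions: struct
    model_host, model_add_host(), model_enable_irqs() and
    model_isr_is_safe() are hypothetical stand-ins, not qla2xxx or
    SCSI midlayer functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host state relevant to the race. */
struct model_host {
    void *shost_data;   /* transport area, set up by host registration */
    bool  irq_enabled;  /* whether the adapter may raise interrupts */
};

static int transport_area; /* dummy object the transport data points at */

/* Models host registration: the transport area becomes valid. */
static void model_add_host(struct model_host *h)
{
    assert(h != NULL);
    h->shost_data = &transport_area;
}

/* Models enabling adapter interrupts. */
static void model_enable_irqs(struct model_host *h)
{
    assert(h != NULL);
    h->irq_enabled = true;
}

/*
 * Models an interrupt whose deferred work touches the transport area.
 * Returns false if that work would dereference a NULL shost_data.
 */
static bool model_isr_is_safe(const struct model_host *h)
{
    assert(h != NULL);
    if (!h->irq_enabled)
        return true;            /* no interrupt can arrive yet */
    return h->shost_data != NULL;
}
```

    Enabling interrupts first leaves a window in which
    model_isr_is_safe() returns false; registering the host first and
    enabling interrupts afterwards closes it.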
     

11 Sep, 2008

3 commits


29 Aug, 2008

5 commits

  • For IBM z series certain LUNs can no longer be accessed.

    This is because in kernel version 2.6.19 a check was introduced to
    avoid creating a generic SCSI device for devices that return PQ=1
    and PDT=0x1f. For WLUNs (see SAM-3, p. 41ff) generic SCSI devices
    should be created unconditionally without looking at the PQ bit, so
    add a check for WLUNs to this test.

    Acked-by: Martin Petermann
    Signed-off-by: James Bottomley

    James Bottomley
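
    A minimal sketch of the combined test, assuming the peripheral
    qualifier (PQ) sits in the top three bits and the peripheral
    device type (PDT) in the low five bits of the first INQUIRY byte.
    The helper names and the 0xc100 well-known-LUN base are
    assumptions made for this example, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed base value identifying well-known LUNs in this sketch. */
#define EXAMPLE_W_LUN_BASE 0xc100u

/* True if the LUN falls in the (assumed) well-known LUN range. */
static bool example_is_wlun(unsigned int lun)
{
    return (lun & 0xff00u) == EXAMPLE_W_LUN_BASE;
}

/*
 * Decide whether a generic SCSI device should be created, given the
 * first byte of the INQUIRY response and the LUN being probed.
 */
static bool example_should_create_device(uint8_t inq_byte0, unsigned int lun)
{
    unsigned int pq  = inq_byte0 >> 5;     /* peripheral qualifier */
    unsigned int pdt = inq_byte0 & 0x1fu;  /* peripheral device type */

    /* PQ=1 with PDT=0x1f means "no device here", except for WLUNs. */
    if (pq == 1 && pdt == 0x1f && !example_is_wlun(lun))
        return false;
    return true;
}
```

    An INQUIRY byte of 0x3f (PQ=1, PDT=0x1f) is rejected for LUN 0 but
    accepted for a LUN such as 0xc101.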
     
  • Change the scsi_check_sense HARDWARE_ERROR check to return
    ADD_TO_MLQUEUE if device->retry_hwerror is set, so that retries can
    occur without being restricted by the blk_noretry_request check.

    Signed-off-by: Mike Anderson
    Signed-off-by: James Bottomley

    Mike Anderson
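
    The decision described above can be sketched as follows. The enum
    values, struct example_device and example_check_sense() are
    hypothetical and only mirror the names used in the message; the
    real dispositions and sense handling live in the SCSI midlayer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dispositions; the numeric values are placeholders. */
enum example_disposition {
    EX_SUCCESS,
    EX_NEEDS_RETRY,
    EX_ADD_TO_MLQUEUE
};

/* Sense key for HARDWARE ERROR as defined by the SCSI standards. */
#define EX_SENSE_HARDWARE_ERROR 0x4

/* Hypothetical per-device setting mirroring device->retry_hwerror. */
struct example_device {
    int retry_hwerror;
};

/*
 * Map a sense key to a disposition. For a hardware error, requeue the
 * command when the device asks for hardware-error retries; requeueing
 * goes through the midlayer queue instead of the retry path that is
 * subject to the no-retry check.
 */
static enum example_disposition
example_check_sense(const struct example_device *dev, unsigned int sense_key)
{
    assert(dev != NULL);

    if (sense_key == EX_SENSE_HARDWARE_ERROR)
        return dev->retry_hwerror ? EX_ADD_TO_MLQUEUE : EX_SUCCESS;

    return EX_SUCCESS; /* all other keys are out of scope for this sketch */
}
```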
     
  • Change the scsi_dh check_sense functions to return ADD_TO_MLQUEUE
    so that retries can occur without being restricted by the
    blk_noretry_request check.

    Signed-off-by: Mike Anderson
    Acked-by: Chandra Seetharaman
    Signed-off-by: James Bottomley

    Mike Anderson
     
  • Signed-off-by: Stefan Richter
    Acked-by: "Martin K. Petersen"
    Signed-off-by: James Bottomley

    Stefan Richter
     
  • This patch removes blk_register_filter and blk_unregister_filter in
    gendisk, and adds them to sd.c, sr.c, and ide-cd.c.

    The commit abf5439370491dd6fbb4fe1a7939680d2a9bc9d4 moved cmdfilter
    from gendisk to request_queue. It turned out that in some subsystems
    multiple gendisks share a single request_queue. So we get:

    Using physmap partition information
    Creating 3 MTD partitions on "physmap-flash":
    0x00000000-0x01c00000 : "User FS"
    0x01c00000-0x01c40000 : "booter"
    kobject (8511c410): tried to init an initialized object, something is seriously wrong.
    Call Trace:
    [] dump_stack+0x8/0x34
    [] kobject_init+0x50/0xcc
    [] kobject_init_and_add+0x24/0x58
    [] blk_register_filter+0x4c/0x64
    [] add_disk+0x78/0xe0
    [] add_mtd_blktrans_dev+0x254/0x278
    [] blktrans_notify_add+0x40/0x78
    [] add_mtd_device+0xd0/0x150
    [] add_mtd_partitions+0x568/0x5d8
    [] physmap_flash_probe+0x2ac/0x334
    [] driver_probe_device+0x12c/0x244
    [] __driver_attach+0x4c/0x84
    [] bus_for_each_dev+0x58/0xac
    [] bus_add_driver+0xc4/0x24c
    [] driver_register+0xcc/0x184
    [] _stext+0x60/0x1bc

    In the long term, we need to fix such subsystems, but we need a quick
    fix now. This patch adds command filter support only to sd and sr,
    though it might be useful for other SG_IO users (such as cciss).

    Signed-off-by: FUJITA Tomonori
    Reported-by: Manuel Lauss
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     

27 Aug, 2008

2 commits

  • sg allowed any command for TYPE_SCANNER. The cmd_filter patchset
    doesn't. We can't change sg's permission since it might break the
    existing software.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
     
  • cmd_filter works only for the block layer SG_IO with SCSI block
    devices. It breaks scsi/sg.c, bsg, and the block layer SG_IO with SCSI
    character devices (such as st). We hit a kernel crash with them.

    The problem is that the cmd_filter code accesses gendisk (which
    holds struct blk_scsi_cmd_filter) via inode->i_bdev->bd_disk. That
    works only for SCSI block device files. With character device files,
    inode->i_bdev leads you to struct cdev, so
    inode->i_bdev->bd_disk->blk_scsi_cmd_filter isn't safe.

    SCSI ULDs don't expose gendisk; they keep it private. bsg needs to be
    independent of any protocol. We shouldn't change ULDs to expose their
    gendisk.

    This patch moves struct blk_scsi_cmd_filter from gendisk to
    request_queue, a common object that everyone can access.

    The user interface doesn't change; users can change the filters via
    /sys/block/. gendisk has a pointer to request_queue, so the cmd_filter
    code accesses struct blk_scsi_cmd_filter through it.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Jens Axboe

    FUJITA Tomonori
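
    The layout change can be sketched with simplified stand-in
    structures. The struct and function names below are hypothetical
    (prefixed ex_) and carry only the fields needed to show the idea;
    they are not the kernel definitions.

```c
#include <stddef.h>

/* Stand-in for struct blk_scsi_cmd_filter: one bitmap word is enough here. */
struct ex_cmd_filter {
    unsigned long write_ok;
};

/* After the change the filter lives in the queue, the common object. */
struct ex_request_queue {
    struct ex_cmd_filter cmd_filter;
};

/* A disk only points at its queue; it no longer owns a filter. */
struct ex_gendisk {
    struct ex_request_queue *queue;
};

/* Block-device path: reach the filter through the disk's queue pointer. */
static struct ex_cmd_filter *ex_filter_from_disk(struct ex_gendisk *disk)
{
    if (disk == NULL || disk->queue == NULL)
        return NULL;
    return &disk->queue->cmd_filter;
}

/* Character-device, sg and bsg paths: only the queue is available. */
static struct ex_cmd_filter *ex_filter_from_queue(struct ex_request_queue *q)
{
    return q ? &q->cmd_filter : NULL;
}
```

    Both lookups return the same object, so callers that never see a
    gendisk can still consult the filter.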
     

24 Aug, 2008

1 commit


20 Aug, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (22 commits)
    [SCSI] ibmvfc: Driver version 1.0.2
    [SCSI] ibmvfc: Add details to async event log
    [SCSI] ibmvfc: Sanitize response lengths
    [SCSI] ibmvfc: Fix for lost async events
    [SCSI] ibmvfc: Fixup host state during reinit
    [SCSI] ibmvfc: Fix another hang on module removal
    [SCSI] ibmvscsi: Fixup desired DMA value for shared memory partitions
    [SCSI] megaraid_sas: remove sysfs dbg_lvl world writeable permissions
    [SCSI] qla2xxx: Update version number to 8.02.01-k7.
    [SCSI] qla2xxx: Explicitly tear-down vports during PCI remove_one().
    [SCSI] qla2xxx: Reference proper ha during SBR handling.
    [SCSI] qla2xxx: Set npiv_supported flag for FCoE HBAs.
    [SCSI] qla2xxx: Don't leak SG-DMA mappings while aborting commands.
    [SCSI] qla2xxx: Correct vport-state management issues during ISP-ABORT.
    [SCSI] qla2xxx: Correct synchronization of software/firmware fcport states.
    [SCSI] scsi_dh: Initialize lun_state in check_ownership()
    [SCSI] scsi_dh: Do not use scsilun in rdac hardware handler
    [SCSI] megaraid_sas: version and Documentation Update
    [SCSI] megaraid_sas: add new controllers (0x78 0x79)
    [SCSI] megaraid_sas: add the shutdown DCMD cmd to driver shutdown routine
    ...

    Linus Torvalds
     

16 Aug, 2008

21 commits

  • Bump driver version to 1.0.2.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • When logging async events, also print the payload in addition to the
    event received.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • Sanitize the response lengths in order to prevent possible oopses
    in the command response path.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • If the client virtual fibre channel adapter is already logged into the
    server and does an NPIV Login again, the async queue, which is used for
    reporting Link Up/Link Down type of events, does not get reset on the
    server side. Fix up the client driver so that we also do not reset it.
    This fixes a problem of lost async events following relogins.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • If an ELS is received while the virtual fibre channel adapter is going
    through its discovery, a flag is set which causes discovery to get
    re-driven. However, the host's state does not get set back to
    IBMVFC_INITIALIZING and scsi_block_requests does not get called again,
    which can result in queuecommand ops getting sent during
    discovery. This should not occur and may cause problems. One example
    is that we may no longer be logged into the target we send the command
    to, resulting in a failure which should not have occurred.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • This fixes a hang on module removal. The module removal code was setting
    the host's state to IBMVFC_HOST_OFFLINE before tearing down the kernel
    thread, but, due to a bug in ibmvfc_wait_while_resetting, was not waiting
    for the kernel thread's offlining work to be done prior to destroying
    the kernel thread, which left the scsi host in a blocked state which we
    never got out of.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
     
  • When running ibmvscsi in a shared memory partition, it must provide
    a default value for the amount of DMA resources it will need in order to
    perform reasonably well. This was being calculated in sectors rather than
    bytes, as it should be. This patch fixes this.

    Signed-off-by: Brian King
    Signed-off-by: James Bottomley

    Brian King
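
    The unit mix-up is easy to see with numbers. The sketch below
    assumes 512-byte sectors; the function name and the idea of
    multiplying by a request count are illustrative assumptions, not
    the formula used by the driver.

```c
#include <stdint.h>

#define EXAMPLE_SECTOR_SIZE 512u /* assumed sector size in bytes */

/*
 * Convert a per-request sector limit and a request count into a DMA
 * size in bytes. Omitting the multiplication by the sector size would
 * understate the requirement by a factor of 512.
 */
static uint64_t example_desired_dma_bytes(uint32_t max_sectors,
                                          uint32_t max_requests)
{
    return (uint64_t)max_sectors * EXAMPLE_SECTOR_SIZE * max_requests;
}
```

    For 256 sectors per request and 100 requests this gives 13107200
    bytes, whereas a sector-based figure would be only 25600.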
     
  • /sys/bus/pci/drivers/megaraid_sas/dbg_lvl defaults to being
    world-writable, which seems bad (letting any user affect kernel driver
    behavior and logging level).

    This turns off group and other (world) write permissions, so that on
    typical production systems only root can write to it.

    [jejb: fix up rejections]
    Signed-off-by: Joseph Malicki
    Acked-by: "Yang, Bo"
    Signed-off-by: James Bottomley

    Joe Malicki
     
  • Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • During internal testing, we've seen issues (hangs) with the
    'deferred' vport tear-down-processing typically accompanied with
    the fc_remove_host() call. This is due to the current
    implementation's back-end vport handling being performed by the
    physical-HA's DPC thread where premature shutdown could lead to
    latent vport requests without a processor.

    This should also address a problem reported by Gal Rosen
    (http://marc.info/?l=linux-scsi&m=121731664417358&w=2) where the
    driver would attempt to awaken a previously torn-down DPC thread
    from interrupt context by implicitly calling wake_up_process()
    rather than the driver's qla2xxx_wake_dpc() helper. Rather than
    reshuffle the remove_one() device-removal code, depend on the
    driver's timer to wake up the DPC process during unload, limiting
    wake-ups based on an 'unloading' flag.

    Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • The executing-HA of an SRB can be referenced from the sp->fcport.
    Use this correct value while processing status-continuation data
    and abort processing.

    Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Mike Hernandez
     
  • Original code inadvertently cleared an SRB's 'flags' while
    aborting; causing a follow-on scsi_dma_unmap() to be potentially
    missed.

    Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
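
    The class of bug described here, overwriting a flags word instead
    of setting one bit in it, can be sketched as follows. The flag
    names, struct ex_srb and the helpers are hypothetical; the
    'mapped' field merely models an outstanding DMA mapping.

```c
#include <assert.h>
#include <stddef.h>

#define EX_SRB_DMA_VALID     (1u << 0) /* a DMA mapping is outstanding */
#define EX_SRB_ABORT_PENDING (1u << 1) /* an abort has been requested */

struct ex_srb {
    unsigned int flags;
    int mapped;          /* 1 while the modelled DMA mapping exists */
};

/* Buggy abort: assignment wipes every other flag, including DMA_VALID. */
static void ex_abort_buggy(struct ex_srb *sp)
{
    assert(sp != NULL);
    sp->flags = EX_SRB_ABORT_PENDING;
}

/* Fixed abort: OR in the new bit and leave the others intact. */
static void ex_abort_fixed(struct ex_srb *sp)
{
    assert(sp != NULL);
    sp->flags |= EX_SRB_ABORT_PENDING;
}

/* Release path: unmap only if the flags still say a mapping exists. */
static void ex_release(struct ex_srb *sp)
{
    assert(sp != NULL);
    if (sp->flags & EX_SRB_DMA_VALID) {
        sp->mapped = 0;
        sp->flags &= ~EX_SRB_DMA_VALID;
    }
}
```

    After ex_abort_buggy() the release path skips the unmap and the
    mapping leaks; after ex_abort_fixed() it is released as expected.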
     
  • * Use correct 'ha' to mark a device lost from ISR.
    I/Os will always be returned on the physical-HA.
    qla2x00_mark_device_lost() should be called with the HA bound
    to the fcport.
    * Mark *all* devices lost during ISP-ABORT (bighammer).

    These fixes correct issues discovered locally where, during
    link-perturbation and heavy vport-I/O, fcport/rport states would
    stray and an rport's scsi-target would be lost (timed-out).

    Signed-off-by: Andrew Vasquez
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • Greg Wettstein (greg@enjellic.com) noted:

    http://article.gmane.org/gmane.linux.scsi/43409

    on a reboot of a previously recognized SCST target, the initiator
    driver would be unable to re-recognize the device as a target.
    It turns out that prior to the SCST software reloading and
    returning its "target-capable" abilities in the PRLI payload,
    the HBA would be re-initialized as an initiator-only type port.
    Since initiators typically classify themselves as an FCP-2
    capable device, both software and firmware do not perform an
    explicit logout during port-loss. Unfortunately, as can be seen
    by the failure case, when the port (now target-capable) returns,
    firmware performs an ADISC without a follow-on PRLI, leaving
    stale 'initiator-only' data in the firmware's port database.

    Correct the discrepancy by performing the explicit logout during
    the transport's request to terminate-rport-io, thus synchronizing
    port states and ensuring a follow-on PRLI is performed.

    Reported-by: Greg Wettstein
    Signed-off-by: Andrew Vasquez
    Cc: Stable Tree
    Signed-off-by: James Bottomley

    Andrew Vasquez
     
  • lun_state needs to be initialized inside check_ownership().

    Signed-off-by: Chandra Seetharaman
    Signed-off-by: James Bottomley

    Chandra Seetharaman
     
  • RDAC storage controller doesn't seem to use the scsilun format. It uses
    only the last byte for LUN.

    Signed-off-by: Chandra Seetharaman
    Signed-off-by: James Bottomley

    Chandra Seetharaman
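
    The difference between the two encodings can be sketched as
    follows. ex_lun_to_scsilun() models the usual 8-byte layout in
    which each 16-bit level is stored big-endian starting at byte 0,
    and ex_lun_to_rdac() models "only the last byte" as byte 7. Both
    helpers are hypothetical illustrations, not the kernel or driver
    functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Encode a LUN in the 8-byte scsilun style: 16 bits per level, big-endian. */
static void ex_lun_to_scsilun(unsigned int lun, uint8_t out[8])
{
    assert(out != NULL);
    memset(out, 0, 8);
    for (int i = 0; i < 8; i += 2) {
        out[i]     = (lun >> 8) & 0xff;
        out[i + 1] = lun & 0xff;
        lun >>= 16;
    }
}

/* Encode a LUN the way the message describes RDAC: last byte only. */
static void ex_lun_to_rdac(unsigned int lun, uint8_t out[8])
{
    assert(out != NULL);
    memset(out, 0, 8);
    out[7] = lun & 0xff;
}
```

    For LUN 5 the first encoding places the value in byte 1, while the
    second places it in byte 7, so the two are not interchangeable.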
     
  • Signed-off-by: Bo Yang
    Signed-off-by: Andrew Morton
    Signed-off-by: James Bottomley

    Yang, Bo
     
  • Add the new controllers (0x78 0x79) support to the driver. Those
    controllers are LSI's next generation (gen2) SAS controllers.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: parenthesise a macro]
    Signed-off-by: Bo Yang
    Signed-off-by: Andrew Morton
    Signed-off-by: James Bottomley

    Yang, Bo
     
  • Add the shutdown DCMD cmd to the driver shutdown routine so that the
    megaraid_sas FW shuts down properly.

    Signed-off-by: Bo Yang
    Signed-off-by: Andrew Morton
    Signed-off-by: James Bottomley

    Yang, Bo
     
  • The MegaRAID SAS driver gets an unexpected interrupt. Adding a dummy
    readl to force a PCI flush fixes this issue.

    Signed-off-by: Bo Yang
    Signed-off-by: Andrew Morton
    Cc: Stable Tree
    Signed-off-by: James Bottomley

    Yang, Bo
     

12 Aug, 2008

1 commit


07 Aug, 2008

3 commits