15 Mar, 2006

2 commits


12 Mar, 2006

1 commit


28 Feb, 2006

7 commits


15 Feb, 2006

1 commit

  • There's a bug in releasing scsi_device where the release function
    actually frees the block queue. However, the block queue release
    calls flush_work(), which requires process context (the scsi_device
    structure may be released from irq context). Update the release function
    to be invoked via the execute_in_process_context() API.

    Also clean up the scsi_target structure releasing via this API.

    Signed-off-by: James Bottomley

    James Bottomley
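
The deferral described above can be sketched in user space. This is only an illustration under stated assumptions: demo_in_irq stands in for the kernel's interrupt-context check, the one-slot pending queue stands in for a workqueue, and every demo_* name is hypothetical rather than the kernel's implementation. The return convention (0 when run at once, 1 when deferred) follows my understanding of execute_in_process_context().

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_fn)(void *);

/* Stand-in for "are we in interrupt context?" (assumption for this sketch). */
int demo_in_irq;

/* One-slot stand-in for a workqueue. */
static demo_fn demo_pending_fn;
static void *demo_pending_data;

/* Run fn now if process context is available, otherwise defer it.
 * Returns 0 if fn ran immediately, 1 if it was deferred. */
int demo_execute_in_process_context(demo_fn fn, void *data)
{
	assert(fn != NULL);
	if (!demo_in_irq) {
		fn(data);
		return 0;
	}
	assert(demo_pending_fn == NULL);	/* the sketch holds one item only */
	demo_pending_fn = fn;
	demo_pending_data = data;
	return 1;
}

/* Called later from process context to run the deferred item, if any. */
void demo_run_deferred(void)
{
	demo_fn fn = demo_pending_fn;

	if (fn == NULL)
		return;
	demo_pending_fn = NULL;
	fn(demo_pending_data);
}

/* Example release callback: counts how often it was invoked. */
void demo_count_release(void *data)
{
	(*(int *)data)++;
}
```

The point of the design is that the release function itself never needs to know which context dropped the last reference.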
     

15 Jan, 2006

1 commit

  • When James Smart fixed the issue of the userspace scan attributes
    crashing the system with the FC transport class he added a patch to
    let the transport class check if the parent is valid for a given
    transport class.

    When adding support for the integrated raid of fusion sas devices
    we ran into a problem with that, as it didn't allow adding virtual
    raid volumes without the transport class knowing about it.

    So this patch adds a user_scan attribute instead, that takes over from
    scsi_scan_host_selected if the transport class sets it and thus lets
    the transport class control the user-initiated scanning. As this
    plugs the hole about user-initiated scanning the target_parent hook
    goes away and we rely on callers of the scanning routines to do
    something sensible.

    For SAS this meant I had to switch from a spinlock to a mutex to
    synchronize the topology linked lists; in FC they were completely
    unsynchronized, which seems wrong.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Christoph Hellwig
     

13 Jan, 2006

1 commit


05 Jan, 2006

1 commit


27 Dec, 2005

1 commit

  • The oops is characteristic of the underlying device being removed from
    visibility before the class device, and sure enough we do device_del()
    before transport_unregister() in the scsi_target_reap() routines. I've
    no idea why this is suddenly showing up, since the code has been in
    there since that function was first invented. However, I've confirmed
    this fixes Andrew Vasquez's boot oops.

    Signed-off-by: James Bottomley
    Signed-off-by: Linus Torvalds

    James Bottomley
     

18 Dec, 2005

1 commit


14 Dec, 2005

1 commit


13 Dec, 2005

1 commit

  • There is a double free in the scsi scan code if a LLDD's slave_alloc()
    call fails. There is a direct call to scsi_free_queue and then the
    following put_device calls the release function, which also frees the
    queue.

    Remove the redundant scsi_free_queue.

    Signed-off-by: Brian King
    Tested-by: Nathan Lynch
    [ Also removed some strange whitespace artifacts in that area ]
    Signed-off-by: Linus Torvalds

    Brian King
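
The ownership rule behind this fix can be sketched in plain C. The sketch assumes a simplified refcounted device whose release function owns the queue; all demo_* names are hypothetical and none of this is the scan code itself. The error path drops its reference and nothing more, so the queue is freed exactly once.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts queue frees so the single-free property can be checked. */
int demo_queue_frees;

struct demo_queue { int unused; };

struct demo_device {
	int refcount;
	struct demo_queue *queue;
};

static void demo_free_queue(struct demo_queue *q)
{
	demo_queue_frees++;
	free(q);
}

/* The release function owns the queue: it is the only place that frees it. */
static void demo_release(struct demo_device *dev)
{
	demo_free_queue(dev->queue);
	free(dev);
}

void demo_put_device(struct demo_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		demo_release(dev);
}

/* Returns 0 on success, -1 on failure (simulated by slave_alloc_fails). */
int demo_add_device(int slave_alloc_fails)
{
	struct demo_device *dev = malloc(sizeof(*dev));

	if (dev == NULL)
		return -1;
	dev->refcount = 1;
	dev->queue = malloc(sizeof(*dev->queue));
	if (dev->queue == NULL) {
		free(dev);
		return -1;
	}
	if (slave_alloc_fails) {
		/* A buggy version would also call demo_free_queue(dev->queue)
		 * here, and the put below would then free it a second time. */
		demo_put_device(dev);
		return -1;
	}
	demo_put_device(dev);	/* the sketch tears down immediately */
	return 0;
}
```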
     

10 Nov, 2005

1 commit


09 Nov, 2005

1 commit


29 Oct, 2005

3 commits


26 Sep, 2005

1 commit


19 Sep, 2005

1 commit

  • This patch (as545) fixes the list traversals in __scsi_remove_target and
    scsi_forget_host. In each case the existing code used list_for_each_entry_safe
    in an _unsafe_ manner, because the list was not protected from outside
    modification while the iteration was running.

    The new scsi_forget_host routine takes the moderately controversial step
    of iterating over devices for removal rather than iterating over targets.
    This makes more sense to me because the current scheme treats targets as
    second-class citizens, created and removed on demand, rather than as
    objects corresponding to actual hardware. (Also I couldn't figure out any
    safe way to iterate over the target list, since it's not so easy to tell
    when a target has already been removed.)

    Signed-off-by: Alan Stern
    Signed-off-by: James Bottomley

    Alan Stern
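
One common way to make such a traversal safe is to restart the walk from the list head after every removal, so no cached "next" pointer survives a modification. The sketch below shows that pattern on a hypothetical singly linked list; the demo_* names are illustrative, locking is omitted, and the commit message does not say whether the patch uses exactly this approach.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_node {
	int target;
	struct demo_node *next;
};

/* Push a new node on the front of the list and return the new head. */
struct demo_node *demo_push(struct demo_node *head, int target)
{
	struct demo_node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->target = target;
	n->next = head;
	return n;
}

int demo_count(const struct demo_node *head)
{
	int count = 0;

	for (; head != NULL; head = head->next)
		count++;
	return count;
}

/* Unlink and free one node; in real code this step could change the list
 * in ways the caller cannot predict. */
static void demo_remove(struct demo_node **head, struct demo_node *victim)
{
	struct demo_node **pp = head;

	while (*pp != NULL && *pp != victim)
		pp = &(*pp)->next;
	if (*pp != NULL) {
		*pp = victim->next;
		free(victim);
	}
}

/* Remove every node matching target, restarting after each removal so the
 * loop never dereferences a pointer saved before the list changed. */
int demo_remove_target(struct demo_node **head, int target)
{
	struct demo_node *n;
	int removed = 0;

restart:
	for (n = *head; n != NULL; n = n->next) {
		if (n->target != target)
			continue;
		demo_remove(head, n);
		removed++;
		goto restart;
	}
	return removed;
}
```

Restarting costs extra passes over the list, which is usually acceptable on a removal path that runs rarely.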
     

11 Sep, 2005

2 commits

  • The original API returned either an ERR_PTR() or a refcounted sdev.
    Unfortunately, if it's successful, you need to do a scsi_device_put() on
    the sdev otherwise the refcounting is wrong.

    Everyone seems to expect that scsi_add_device() should be callable
    without doing the ref put, so alter the API so it is (we still have
    __scsi_add_device with the original behaviour).

    The only actual caller that needs altering is the one in firewire ...
    not because it gets this right, but because it acts on the error if one
    is returned.

    Acked-by: Stefan Richter
    Signed-off-by: James Bottomley

    James Bottomley
     
  • This patch (as546) fixes an oops-causing failure to check the return code
    from scsi_device_get. The call can return an error if the LLD is being
    unloaded from memory.

    Signed-off-by: Alan Stern
    Signed-off-by: James Bottomley

    Alan Stern
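
The scsi_add_device() change in the first commit above amounts to wrapping the reference-returning call in one that drops the caller's reference. A minimal user-space sketch, assuming a simplified error-pointer encoding and a single static device; all demo_* names are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Simplified error-pointer helpers, modelled on the ERR_PTR() idea. */
static void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_sdev { int refcount; };

/* One device, already holding the reference owned by the "device model". */
static struct demo_sdev demo_the_device = { 1 };

int demo_device_refcount(void)
{
	return demo_the_device.refcount;
}

void demo_device_put(struct demo_sdev *sdev)
{
	assert(sdev->refcount > 0);
	sdev->refcount--;
}

/* Original-style call: returns an error pointer or a device carrying an
 * extra reference that the caller must drop. */
struct demo_sdev *demo_add_device_raw(int fail)
{
	if (fail)
		return demo_err_ptr(-ENODEV);
	demo_the_device.refcount++;
	return &demo_the_device;
}

/* New-style call: returns 0 or a negative errno, and drops the reference
 * on success so callers need no put. */
int demo_add_device(int fail)
{
	struct demo_sdev *sdev = demo_add_device_raw(fail);

	if (demo_is_err(sdev))
		return (int)demo_ptr_err(sdev);
	demo_device_put(sdev);
	return 0;
}
```

Callers that do want the reference keep using the raw variant, which matches the message's note that the original behaviour remains available.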
     

09 Sep, 2005

1 commit

  • This patch (as543) adds a private entry point to scsi_scan_target, for use
    when the caller already owns the scan_mutex, and updates the kerneldoc for
    that routine (which was badly out-of-date). It converts scsi_scan_channel
    to use the new entry point. Lastly, it modifies scsi_get_host_dev to make
    it acquire the scan_mutex, necessary since the routine adds a new
    scsi_device even if it doesn't do any actual scanning.

    Signed-off-by: Alan Stern
    Signed-off-by: James Bottomley

    Alan Stern
     

29 Aug, 2005

2 commits


28 Aug, 2005

1 commit


09 Aug, 2005

1 commit

  • We have some nasty issues with 2.6.12-rc6. Any request to scan on
    the lpfc or qla2xxx FC adapters will oops. What is happening is the
    system is defaulting to non-transport registered targets, which
    inherit the parent of the scan. On this second scan, performed by
    the attribute, the parent becomes the shost instead of the rport.
    The slave functions in the 2 FC adapters use starget_to_rport()
    routines, which incorrectly map the shost as an rport pointer.

    Additionally, this pointed out other weaknesses:
    - If the target structure is torn down outside of the transport,
    we have no method for it to be regenerated at the proper parent.
    - We have race conditions on the target being allocated by both
    the midlayer scan (parent=shost) and by the fc transport
    (parent=rport).

    Signed-off-by: James Bottomley

    James.Smart@Emulex.Com
     

31 Jul, 2005

1 commit


28 Jul, 2005

1 commit

  • Adds a missing check for an error return code from scsi_sysfs_add_sdev.
    This resolves entry #4863 in the OSDL bugzilla. Although in that bug
    report the failure occurred because of a confusion over scanning vs.
    rescanning, in general add_sdev can fail for a number of reasons (the
    simplest being insufficient memory) and the caller should cope properly.

    Signed-off-by: Alan Stern
    Cc: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     

14 Jul, 2005

1 commit

  • One of the issues we had was reverting the midlayer's lun value
    into the 8-byte lun value that we wanted to send to the device.
    Historically, there's been some combination of byte swapping,
    setting high/low, etc. There's also been no common thread between
    how our driver did it and others. I also got very confused as
    to why byteswap routines were being used.

    Anyway, this patch is a LLDD-callable function that reverts the
    midlayer's lun value, stored in an int, to the 8-byte quantity
    (note: this is not the real 8-byte quantity, just the same amount
    that scsilun_to_int() was able to convert and store originally).

    This also solves the dilemma of the thread:
    http://marc.theaimsgroup.com/?l=linux-kernel&m=112116767118981&w=2

    A patch for the lpfc driver to use this function will be along
    in a few days (batched with other patches).

    Signed-off-by: James Bottomley

    James.Smart@Emulex.Com
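
A user-space sketch of the conversion described above, with its inverse for a round-trip check. It assumes the int holds 16-bit LUN levels, lowest level first, each stored big-endian in the 8-byte field; the demo_* names are hypothetical and the code is an illustration, not the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_scsi_lun {
	uint8_t bytes[8];
};

/* Spread the int over the 8-byte LUN, two bytes per 16-bit level. Only
 * sizeof(unsigned int) bytes can carry information; the rest stay zero. */
void demo_int_to_scsilun(unsigned int lun, struct demo_scsi_lun *out)
{
	size_t i;

	assert(out != NULL);
	memset(out->bytes, 0, sizeof(out->bytes));
	for (i = 0; i < sizeof(lun) && i + 1 < sizeof(out->bytes); i += 2) {
		out->bytes[i] = (lun >> 8) & 0xFF;
		out->bytes[i + 1] = lun & 0xFF;
		lun >>= 16;
	}
}

/* Inverse direction: fold the leading bytes back into an int. */
unsigned int demo_scsilun_to_int(const struct demo_scsi_lun *in)
{
	unsigned int lun = 0;
	size_t i;

	assert(in != NULL);
	for (i = 0; i < sizeof(lun) && i + 1 < sizeof(in->bytes); i += 2)
		lun |= (((unsigned int)in->bytes[i] << 8) | in->bytes[i + 1])
			<< (i * 8);
	return lun;
}
```

No byte-swap helper appears because the shifts address bytes by value, which keeps the result independent of host endianness.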
     

18 Jun, 2005

1 commit


03 Jun, 2005

1 commit

  • With CONFIG_DEBUG_SLAB=y I see slab corruption messages during boot on
    pSeries machines with IPR adapters with any 2.6.12-rc kernel.

    The change which seems to have introduced the problem is "SCSI: revamp
    target scanning routines" and may be found at:
    http://marc.theaimsgroup.com/?l=bk-commits-head&m=111093946426333&w=2

    In order to revert that in a 2.6.12-rc1 tree, I had to revert "target
    code updates to support scanned targets" first:
    http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094132524649&w=2

    With both patches reverted, the corruption messages go away.

    ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21,
    2005)
    ipr 0001:d0:01.0: Found IOA with IRQ: 167
    ipr 0001:d0:01.0: Starting IOA initialization sequence.
    ipr 0001:d0:01.0: Adapter firmware version: 020A005C
    ipr 0001:d0:01.0: IOA initialized.
    scsi0 : IBM 570B Storage Adapter
    Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770
    Type: Enclosure ANSI SCSI revision: 02
    Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
    Type: Direct-Access ANSI SCSI revision: 04
    Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
    Type: Direct-Access ANSI SCSI revision: 04
    Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
    Type: Direct-Access ANSI SCSI revision: 04
    Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF
    Type: Direct-Access ANSI SCSI revision: 04
    Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770
    Type: Enclosure ANSI SCSI revision: 02
    Slab corruption: start=c0000001e8de5268, len=512
    Redzone: 0x5a2cf071/0x5a2cf071.
    Last user: [](.scsi_target_dev_release+0x28/0x50)
    080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
    Prev obj: start=c0000001e8de5050, len=512
    Redzone: 0x5a2cf071/0x5a2cf071.
    Last user: [](0x0)
    000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
    010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
    Next obj: start=c0000001e8de5480, len=512
    Redzone: 0x170fc2a5/0x170fc2a5.
    Last user: [](.as_init_queue+0x5c/0x228)
    000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
    010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
    Slab corruption: start=c0000001e8de5268, len=512
    Redzone: 0x5a2cf071/0x5a2cf071.
    Last user: [](.scsi_target_dev_release+0x28/0x50)
    080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
    Prev obj: start=c0000001e8de5050, len=512
    Redzone: 0x5a2cf071/0x5a2cf071.
    Last user: [](0x0)
    000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
    010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
    Next obj: start=c0000001e8de5480, len=512
    Redzone: 0x170fc2a5/0x170fc2a5.
    Last user: [](.as_init_queue+0x5c/0x228)
    000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
    010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
    ...

    I did some digging and the problem seems to be a refcounting issue in
    __scsi_add_device. The target gets freed in scsi_target_reap, and
    then __scsi_add_device tries to do another device_put on it.

    Signed-off-by: Nathan Lynch
    Signed-off-by: James Bottomley

    Nathan Lynch
     

26 May, 2005

2 commits

  • This gives the HBA driver notice when a target is created and
    destroyed to allow it to manage its own target based allocations
    accordingly.

    This is a much reduced version of the original patch sent in by
    James.Smart@Emulex.com

    Signed-off-by: James Bottomley

    James Bottomley
     
  • a) TYPE_SDAD renamed to TYPE_RBC and taken to scsi.h
    b) in sbp2.c remapping of TYPE_RPB to TYPE_DISK turned off
    c) relevant places in midlayer and sd.c taught to accept TYPE_RBC
    d) sd.c::sd_read_cache_type() looks into page 6 when dealing with
    TYPE_RBC - these guys have writeback cache flag there and are not guaranteed
    to have page 8 at all.
    e) sd_read_cache_type() got an extra sanity check - it checks that
    it got the page it asked for before using its contents. And screams if
    mismatch had happened. Rationale: there are broken devices out there that
    are "helpful" enough to go for "I don't have a page you've asked for, here,
    have another one". For example, PL3507 had been caught doing just that...
    f) sbp2 sets sdev->use_10_for_rw and sdev->use_10_for_ms instead
    of bothering to remap READ6/WRITE6/MOD_SENSE, so most of the conversions
    in there are gone now.

    Incidentally, I wonder if USB storage devices that have no
    mode page 8 are simply RBC ones. I haven't touched that, but it might
    be interesting to check...

    Signed-off-by: Al Viro
    Signed-off-by: James Bottomley

    Al Viro
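
The sanity check in item (e) of the second commit above, together with the page-dependent cache flag from item (d), can be sketched as follows. The bit positions are my understanding of the mode page layouts (page code in the low 6 bits of byte 0, a write-cache-enable bit in byte 2 of page 8, a write-cache-disable bit in byte 2 of the RBC page 6) and should be treated as assumptions; the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	DEMO_PAGE_RBC = 0x06,		/* RBC device parameters page */
	DEMO_PAGE_CACHING = 0x08	/* caching mode page */
};

/* page points at the start of the returned mode page (after any headers).
 * Returns 1 if the write cache is enabled, 0 if it is not, and -1 if the
 * device answered with a different page than the one requested or the
 * data is too short to tell. */
int demo_write_cache_enabled(const uint8_t *page, size_t len, int wanted_page)
{
	assert(page != NULL);
	if (len < 3)
		return -1;
	/* Sanity check: did we get the page we asked for? */
	if ((page[0] & 0x3f) != wanted_page)
		return -1;
	if (wanted_page == DEMO_PAGE_CACHING)
		return (page[2] & 0x04) != 0;	/* enable bit set */
	if (wanted_page == DEMO_PAGE_RBC)
		return (page[2] & 0x01) == 0;	/* disable bit clear */
	return -1;
}
```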
     

25 Apr, 2005

1 commit

  • Somebody forgot that | has higher priority than ?:. As a result,
    allocation is done with bogus flags - instead of GFP_ATOMIC + possibly
    GFP_DMA we always get GFP_DMA and no GFP_ATOMIC.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
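
The precedence problem described above is easy to reproduce. In the sketch below the flag values are made up for illustration (the kernel's real GFP_ATOMIC and GFP_DMA constants differ), and the buggy expression is written with explicit parentheses that show how the compiler groups the unparenthesized form.

```c
#include <assert.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_GFP_ATOMIC 0x20u
#define DEMO_GFP_DMA    0x01u

/* How "DEMO_GFP_ATOMIC | need_dma ? DEMO_GFP_DMA : 0" is grouped: the OR
 * becomes the condition, which is always non-zero. */
unsigned int demo_buggy_flags(int need_dma)
{
	return (DEMO_GFP_ATOMIC | (unsigned int)need_dma) ? DEMO_GFP_DMA : 0;
}

/* Intended meaning: always atomic, plus DMA only when requested. */
unsigned int demo_fixed_flags(int need_dma)
{
	return DEMO_GFP_ATOMIC | (need_dma ? DEMO_GFP_DMA : 0);
}

/* Self-check that can be called from a test harness or main(). */
void demo_check_flags(void)
{
	assert(demo_buggy_flags(0) == DEMO_GFP_DMA);
	assert(demo_fixed_flags(0) == DEMO_GFP_ATOMIC);
}
```

Parenthesizing the conditional is the whole fix; the buggy form yields the DMA flag alone regardless of need_dma.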