07 Jan, 2012

1 commit


04 Jan, 2012

4 commits


15 Dec, 2011

1 commit


14 Dec, 2011

5 commits

  • During session recovery, the conn_stop call will trigger a flush
    to all outstanding SCSI cmds in the xmit queue. This will set
    all outstanding task->sc to NULL prior to the session_teardown
    call which frees the task memory.

    In the bnx2i SCSI response processing path, only the task was checked
    for NULL under the session lock before task->sc->request was dereferenced.
    If SCSI cmd responses are still waiting to be processed, the following
    kernel panic can occur because task->sc is NULL.

    Call Trace:
    [ 69.720205] [] bnx2i_process_new_cqes+0x290/0x3c0 [bnx2i]
    [ 69.804289] [] bnx2i_fastpath_notification+0x33/0xa0 [bnx2i]
    [ 69.891490] [] bnx2i_indicate_kcqe+0xdb/0x330 [bnx2i]
    [ 69.971427] [] service_kcqes+0x16e/0x1d0 [cnic]
    [ 70.045132] [] cnic_service_bnx2x_kcq+0x2a/0x50 [cnic]
    [ 70.126105] [] cnic_service_bnx2x_bh+0x43/0x140 [cnic]
    [ 70.207081] [] tasklet_action+0x66/0x110
    [ 70.273521] [] __do_softirq+0xef/0x220
    [ 70.337887] [] call_softirq+0x1c/0x30

    This patch adds the !task->sc check and also protects the sc dereferencing
    under the session lock.

    Signed-off-by: Eddie Wai
    Signed-off-by: James Bottomley

    Eddie Wai
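
The check described above can be illustrated in userspace. This is a minimal sketch, not the bnx2i code: `struct cmd`, `struct task`, `flush_task`, and `process_response` are hypothetical stand-ins, and a pthread mutex plays the role of the session lock. The point is that both the task pointer and its `sc` member are tested, and `sc` is dereferenced, while the lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SCSI command and the iSCSI task. */
struct cmd  { int request_tag; };
struct task { struct cmd *sc; };

static pthread_mutex_t session_lock = PTHREAD_MUTEX_INITIALIZER;

/* Simulates the flush during recovery: detach the command from the task. */
static void flush_task(struct task *task)
{
    assert(task != NULL);
    pthread_mutex_lock(&session_lock);
    task->sc = NULL;
    pthread_mutex_unlock(&session_lock);
}

/*
 * Returns the request tag of the command behind the task, or -1 if the
 * task or its command is NULL (for example, already flushed).
 */
static int process_response(struct task *task)
{
    int tag = -1;

    pthread_mutex_lock(&session_lock);
    /* Check both pointers, and dereference sc, while holding the lock. */
    if (task && task->sc)
        tag = task->sc->request_tag;
    pthread_mutex_unlock(&session_lock);

    return tag;
}
```

A response that arrives after `flush_task()` has run is then dropped instead of dereferencing a NULL `sc`.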
     
  • iscsi_conn_setup can fail so we must check for NULL being
    returned.

    Signed-off-by: Mike Christie
    Signed-off-by: James Bottomley

    Mike Christie
     
  • When qla4xxx_get_fwddb_entry returns QLA_ERROR, next_idx is not
    updated:

        for (idx = 0; idx < max_ddbs; idx = next_idx) {
                ret = qla4xxx_get_fwddb_entry(ha, idx, NULL, 0, NULL,
                                              &next_idx, &state, &conn_err,
                                              NULL, NULL);
                if (ret == QLA_ERROR)
                        continue;

    This means there is a risk that the 'idx < max_ddbs' condition will never
    become false and the loop will run forever.
    Fix this by explicitly incrementing next_idx in the error condition.

    A break instead of continue may be more appropriate; that decision is
    left to the QLogic maintainer.

    Signed-off-by: Tomas Henzl
    Signed-off-by: Mike Christie
    Signed-off-by: James Bottomley

    Tomas Henzl
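
A minimal userspace sketch of the loop hazard described in this commit. `get_entry` is a hypothetical stand-in for `qla4xxx_get_fwddb_entry` that fails for one index and, on failure, leaves `next_idx` untouched; the return codes and names are illustrative, not the driver's. Incrementing `next_idx` in the error branch is what guarantees the loop makes progress.

```c
#include <assert.h>

#define ENTRY_OK     0
#define ENTRY_ERROR  1

/*
 * Hypothetical stand-in for the firmware query: fails for index bad_idx
 * and, on error, does not write *next_idx.
 */
static int get_entry(int idx, int bad_idx, int *next_idx)
{
    if (idx == bad_idx)
        return ENTRY_ERROR;
    *next_idx = idx + 1;
    return ENTRY_OK;
}

/* Walks all entries and returns how many were read successfully. */
static int count_entries(int max_ddbs, int bad_idx)
{
    int idx, next_idx = 0, found = 0;

    assert(max_ddbs >= 0);
    for (idx = 0; idx < max_ddbs; idx = next_idx) {
        if (get_entry(idx, bad_idx, &next_idx) == ENTRY_ERROR) {
            /* Without this increment idx would never change again. */
            next_idx++;
            continue;
        }
        found++;
    }
    return found;
}
```

With five entries and a failure at index 2, the loop skips the bad entry and still terminates, counting the other four.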
     
  • With open-iscsi support, target entries persisted in the FLASH were not
    logged in. Added support in the qla4xxx driver to log in, at probe time,
    to the target entries saved in the FLASH by the user.
    With this change, upgrading to the new kernel with open-iscsi support in
    qla4xxx ensures that the user's original target entries are logged in on
    driver load.

    Signed-off-by: Manish Rangankar
    Signed-off-by: Ravi Anand
    Signed-off-by: Mike Christie
    Signed-off-by: James Bottomley

    Mike Christie
     
  • The error exit path leaks preempt count. Add the missing put_cpu().

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Yi Zou
    Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    Thomas Gleixner
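
The leak pattern can be sketched in userspace with a counter standing in for the preempt count. `get_cpu_sim`, `put_cpu_sim`, and `do_work` are hypothetical names; in the kernel, `get_cpu()` disables preemption and `put_cpu()` re-enables it. Every exit path, including the error path, has to drop what the get took.

```c
#include <assert.h>

static int preempt_count;   /* stand-in for the kernel's preempt count */

static int get_cpu_sim(void)
{
    preempt_count++;        /* the real get_cpu() disables preemption */
    return 0;               /* pretend we always run on CPU 0 */
}

static void put_cpu_sim(void)
{
    assert(preempt_count > 0);
    preempt_count--;        /* the real put_cpu() re-enables preemption */
}

/* Returns 0 on success, -1 on failure; the count is balanced on both paths. */
static int do_work(int fail)
{
    int cpu = get_cpu_sim();
    int ret = 0;

    (void)cpu;              /* per-CPU data would be looked up with this */
    if (fail) {
        ret = -1;
        goto out;           /* the error exit still reaches put_cpu_sim() */
    }
    /* ... per-CPU work would happen here ... */
out:
    put_cpu_sim();
    return ret;
}
```

Routing the error exit through a single `out:` label is one common way to keep the get and put paired.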
     

12 Dec, 2011

15 commits


15 Nov, 2011

1 commit

  • The Windows driver .inf disables ASPM on hpsa devices. Do the same because
    selecting a non-default ASPM policy can cause the device to hang.

    Signed-off-by: Matthew Garrett
    Cc: stable@kernel.org
    Acked-by: Mike Miller
    Signed-off-by: James Bottomley

    Matthew Garrett
     

11 Nov, 2011

2 commits

  • The aacraid controller can hang on some nodes if the kernel uses a
    non-default (powersave) ASPM policy. The controller hangs shortly after a
    successful load and hardware detection. The SCSI error handler detects the
    hang and tries to restart the hardware, but that does not help.

    This was first noticed on a RHEL6-based openVZ kernel after backporting the
    aacraid driver from mainline (the RHEL6 kernel with the original driver
    works well):
    http://bugzilla.openvz.org/show_bug.cgi?id=2043

    The issue happens because the default ASPM policy was changed in Red Hat
    kernels, so Red Hat engineers noticed the problem long ago:
    on Fedora 12
    https://bugzilla.redhat.com/show_bug.cgi?id=540478
    on Fedora 14
    https://bugzilla.redhat.com/show_bug.cgi?id=679385

    In the RHEL6 kernel the issue was fixed by disabling ASPM in the aacraid
    driver. According to the kernel changelog, this seems to have been done by
    Matthew Garrett:
    - [scsi] aacraid: Disable ASPM by default (Matthew Garrett) [599735]

    However, it seems this patch was not submitted to mainline. I have
    reproduced the issue on a vanilla 3.1.0 kernel booted with the
    "pcie_aspm.policy=powersave" option, so I believe it makes sense to do it
    now.

    Signed-off-by: Vasily Averin
    [mjg: Checking the Windows drivers indicates that they disable ASPM under all
    circumstances, so:]
    Acked-by: Matthew Garrett
    Acked-by: Achim Leubner
    Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    Vasily Averin
     
  • There was supposed to be a kzalloc() here and the compiler complained
    about it.
    mpt2sas_scsih.c: In function ‘mpt2sas_scsih_reset_handler’:
    mpt2sas_scsih.c:2807:21: warning: ‘fw_event’ may be used uninitialized in this function [-Wuninitialized]

    Signed-off-by: Dan Carpenter
    Acked-by: "Nandigama, Nagalakshmi"
    Signed-off-by: James Bottomley

    Dan Carpenter
     

10 Nov, 2011

2 commits

  • When we tear down a device we try to flush all outstanding
    commands in scsi_free_queue(). However the check in
    scsi_request_fn() is imperfect as it only signals that
    we _might start_ aborting commands, not that we've actually
    aborted some.
    So move the printk inside the scsi_kill_request function; this will
    also give us a hint about which commands are aborted.

    Signed-off-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Hannes Reinecke
     
  • On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote:
    > Hi all,
    >
    > Starting some time last week I am getting the following during boot on
    > our PPC970 blade:
    >
    > calling .ipr_init+0x0/0x68 @ 1
    > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011)
    > ipr 0000:01:01.0: Found IOA with IRQ: 26
    > ipr 0000:01:01.0: Starting IOA initialization sequence.
    > ipr 0000:01:01.0: Adapter firmware version: 06160039
    > ipr 0000:01:01.0: IOA initialized.
    > scsi0 : IBM 572E Storage Adapter
    > ------------[ cut here ]------------
    > WARNING: at drivers/scsi/scsi_lib.c:1704
    > Modules linked in:
    > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70
    > REGS: c0000000783c2f60 TRAP: 0700 Not tainted (3.1.0-autokern1)
    > MSR: 8000000000029032 CR: 24002024 XER: 20000002
    > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0
    > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0
    > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000
    > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000
    > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800
    > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000
    > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001
    > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938
    > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0
    > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90
    > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
    > Call Trace:
    > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable)
    > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0
    > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40
    > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650
    > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100
    > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0
    > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390
    > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50
    > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150
    > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210
    > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110
    > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0
    > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40
    > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340
    > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0
    > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140
    > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68
    > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0
    > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc
    > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70
    > Instruction dump:
    > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010
    > f821ff71 e8030398 3120ffff 7c090110 e86303b0 482de065 60000000
    > ---[ end trace 759bed76a85e8dec ]---
    > scsi 0:0:1:0: Direct-Access IBM-ESXS MAY2036RC T106 PQ: 0 ANSI: 5
    > ------------[ cut here ]------------
    >
    > I get lots more of these. The obvious commit to point the finger at
    > is 3308511c93e6 ("[SCSI] Make scsi_free_queue() kill pending SCSI
    > commands") but the root cause may be something different.

    Caused by

    commit f7c9c6bb14f3104608a3a83cadea10a6943d2804
    Author: Anton Blanchard
    Date: Thu Nov 3 08:56:22 2011 +1100

    [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev

    That commit doesn't completely do the teardown. The true fix is to do a
    proper teardown instead of hand-rolling it.

    Reported-by: Stephen Rothwell
    Tested-by: Stephen Rothwell
    Cc: stable@kernel.org #2.6.38+
    Signed-off-by: James Bottomley

    James Bottomley
     

07 Nov, 2011

2 commits

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     
  • * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    scsi: drop unused Kconfig symbol
    pci: drop unused Kconfig symbol
    stmmac: drop unused Kconfig symbol
    x86: drop unused Kconfig symbol
    powerpc: drop unused Kconfig symbols
    powerpc: 40x: drop unused Kconfig symbol
    mips: drop unused Kconfig symbols
    openrisc: drop unused Kconfig symbols
    arm: at91: drop unused Kconfig symbol
    samples: drop unused Kconfig symbol
    m32r: drop unused Kconfig symbol
    score: drop unused Kconfig symbols
    sh: drop unused Kconfig symbol
    um: drop unused Kconfig symbol
    sparc: drop unused Kconfig symbol
    alpha: drop unused Kconfig symbol

    Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
    as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
    should be deleted.

    Linus Torvalds
     

06 Nov, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (45 commits)
    [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev
    [SCSI] scsi_dh_alua: Fix the time inteval for alua rtpg commands
    [SCSI] scsi_transport_iscsi: Fix documentation os parameter
    [SCSI] mv_sas: OCZ RevoDrive3 & zDrive R4 support
    [SCSI] libfc: improve flogi retries to avoid lport stuck
    [SCSI] libfc: avoid exchanges collision during lport reset
    [SCSI] libfc: fix checking FC_TYPE_BLS
    [SCSI] edd: Treat "XPRS" host bus type the same as "PCI"
    [SCSI] isci: overriding max_concurr_spinup oem parameter by max(oem, user)
    [SCSI] isci: revert bcn filtering
    [SCSI] isci: Fix hard reset timeout conditions.
    [SCSI] isci: No need to manage the pending reset bit on pending requests.
    [SCSI] isci: Remove redundant isci_request.ttype field.
    [SCSI] isci: Fix task management for SMP, SATA and on dev remove.
    [SCSI] isci: No task_done callbacks in error handler paths.
    [SCSI] isci: Handle task request timeouts correctly.
    [SCSI] isci: Fix tag leak in tasks and terminated requests.
    [SCSI] isci: Immediately fail I/O to removed devices.
    [SCSI] isci: Lookup device references through requests in completions.
    [SCSI] ipr: add definitions for additional adapter
    ...

    Linus Torvalds
     

05 Nov, 2011

1 commit

  • * 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits)
    virtio-blk: use ida to allocate disk index
    hpsa: add small delay when using PCI Power Management to reset for kump
    cciss: add small delay when using PCI Power Management to reset for kump
    xen/blkback: Fix two races in the handling of barrier requests.
    xen/blkback: Check for proper operation.
    xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
    xen/blkback: Report VBD_WSECT (wr_sect) properly.
    xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
    xen-blkfront: plug device number leak in xlblk_init() error path
    xen-blkfront: If no barrier or flush is supported, use invalid operation.
    xen-blkback: use kzalloc() in favor of kmalloc()+memset()
    xen-blkback: fixed indentation and comments
    xen-blkfront: fix a deadlock while handling discard response
    xen-blkfront: Handle discard requests.
    xen-blkback: Implement discard requests ('feature-discard')
    xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
    drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
    drivers/block/loop.c: emit uevent on auto release
    drivers/block/cpqarray.c: use pci_dev->revision
    loop: always allow userspace partitions and optionally support automatic scanning
    ...

    Fix up trivial header file inclusion conflict in drivers/block/loop.c

    Linus Torvalds
     

03 Nov, 2011

3 commits

  • When looking at memory consumption issues I noticed quite a
    lot of memory in the kmalloc-2048 bucket:

    OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
    6561 6471 98% 2.30K 243 27 15552K kmalloc-2048

    Over 15MB. slub debug shows that cfq is responsible for almost
    all of it:

    # sort -nr /sys/kernel/slab/kmalloc-2048/alloc_calls
    6402 .cfq_init_queue+0xec/0x460 age=43423/43564/43655 pid=1 cpus=4,11,13

    In scsi_alloc_sdev we do scsi_alloc_queue but if slave_alloc
    fails we don't free it with scsi_free_queue.

    The patch below fixes the issue:

    OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
    135 72 53% 2.30K 5 27 320K kmalloc-2048

    # cat /sys/kernel/slab/kmalloc-2048/alloc_calls
    3 .cfq_init_queue+0xec/0x460 age=3811/3876/3925 pid=1 cpus=4,11,13

    Signed-off-by: Anton Blanchard
    Cc: #2.6.38+
    Signed-off-by: James Bottomley

    Anton Blanchard
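
The leak in this commit is a missing step in error unwinding. Below is a minimal userspace sketch of the pattern, not the scsi_scan.c code: `alloc_queue`, `free_queue`, and `alloc_device` are hypothetical stand-ins, and the `slave_alloc_ok` flag plays the role of the result of the host's slave_alloc callback. A counter tracks live queues so the leak would be visible.

```c
#include <assert.h>
#include <stdlib.h>

static int live_queues;     /* queues allocated but not yet freed */

struct queue  { int unused; };
struct device { struct queue *q; };

static struct queue *alloc_queue(void)
{
    struct queue *q = calloc(1, sizeof(*q));

    if (q)
        live_queues++;
    return q;
}

static void free_queue(struct queue *q)
{
    assert(live_queues > 0);
    live_queues--;
    free(q);
}

/* Returns a device on success, NULL on failure with nothing left allocated. */
static struct device *alloc_device(int slave_alloc_ok)
{
    struct device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;

    dev->q = alloc_queue();
    if (!dev->q)
        goto out_free_dev;

    if (!slave_alloc_ok)
        goto out_free_queue;    /* the step that was missing */

    return dev;

out_free_queue:
    free_queue(dev->q);
out_free_dev:
    free(dev);
    return NULL;
}
```

Unwinding in reverse order of allocation through labels keeps each failure point responsible only for what was set up before it.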
     
  • This patch corrects the retry interval for the alua rtpg command. The
    intent was to wait a number of seconds between retries, but that was not
    happening because msleep takes its argument in milliseconds.

    Also added minor text after successful attach.

    Signed-off-by: Babu Moger
    Acked-by: Hannes Reinecke
    Signed-off-by: James Bottomley

    Moger, Babu
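
The unit mismatch is easy to see in a sketch. The kernel's `msleep()` takes milliseconds, so an interval meant in seconds has to be scaled by 1000 before it is passed in. `secs_to_msecs` is a hypothetical helper written for this illustration, and `MSEC_PER_SEC` is defined locally here; this is not the code used in scsi_dh_alua.

```c
#include <assert.h>
#include <limits.h>

#define MSEC_PER_SEC 1000U

/*
 * Converts a retry interval in seconds to the value a millisecond-based
 * sleep function expects.
 */
static unsigned int secs_to_msecs(unsigned int secs)
{
    assert(secs <= UINT_MAX / MSEC_PER_SEC);    /* guard against overflow */
    return secs * MSEC_PER_SEC;
}
```

Passing the raw seconds value instead would make a 2-second retry interval last only 2 milliseconds.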
     
  • Fixes documentation of a parameter of the iscsi_bsg_host_add function to
    silence make htmldocs.

    Signed-off-by: Marcos Paulo de Souza
    Reviewed-by: Mike Christie
    Signed-off-by: James Bottomley

    Marcos Paulo de Souza
     

01 Nov, 2011

2 commits