29 Oct, 2018

1 commit

  • errata:
    When a read command returns less data than specified in the PRDs (for
    example, there are two PRDs for this command, but the device returns a
    number of bytes which is less than in the first PRD), the second PRD of
    this command is not read out of the PRD FIFO, causing the next command
    to use this PRD erroneously.

    Workaround:
    - force sg_tablesize = 1
    - modify the sg_io function in block/scsi_ioctl.c to use a 64k buffer
    allocated with dma_alloc_coherent during the probe in ahci_imx
    - to fix the SCSI/SATA hang that occurs when a CD-ROM and an HDD are
    accessed simultaneously after the workaround is applied, do not go to
    sleep in scsi_eh_handler when the host has failed

    Signed-off-by: Richard Zhu

    Richard Zhu
     

20 Oct, 2018

4 commits

  • [ Upstream commit f1f1fadacaf08b7cf11714c0c29f8fa4d4ef68a9 ]

    When sd_init_command() gets a command with an unknown req_op() it crashes the
    system via BUG().

    This makes debugging the actual reason for the broken request cmd_flags pretty
    hard as the system is down before it's able to write out debugging data on the
    serial console or the trace buffer.

    Change the BUG() to a WARN_ON() and return BLKPREP_KILL to fail gracefully and
    return an I/O error to the producer of the request.

    Signed-off-by: Johannes Thumshirn
    Cc: Hannes Reinecke
    Cc: Bart Van Assche
    Cc: Christoph Hellwig
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
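
For the sd_init_command() change above, the pattern of warning and failing the request instead of halting can be sketched in userspace C. This is a minimal illustration, not the sd.c code: prep_request(), the opcode enum and PREP_KILL are hypothetical stand-ins for req_op() values and BLKPREP_KILL.

```c
#include <stdio.h>

/* Hypothetical stand-ins for request opcodes and prep return codes. */
enum req_op_sketch { OP_READ = 0, OP_WRITE = 1, OP_FLUSH = 2 };
enum prep_status { PREP_OK = 0, PREP_KILL = 1 };

/*
 * Returns PREP_KILL (an I/O error for the submitter) on an unknown opcode
 * rather than terminating, so diagnostics can still be written out.
 */
enum prep_status prep_request(int op)
{
    switch (op) {
    case OP_READ:
    case OP_WRITE:
    case OP_FLUSH:
        return PREP_OK;
    default:
        /* Analogue of WARN_ON(): report the problem and keep running. */
        fprintf(stderr, "warning: unknown request op %d\n", op);
        return PREP_KILL;
    }
}
```

The caller sees an error for the one bad request while the rest of the system stays up to report what went wrong.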
     
  • [ Upstream commit 318ddb34b2052f838aa243d07173e2badf3e630e ]

    While adding the primary ipr adapter back via DLPAR, the driver goes through
    adapter initialization and then schedules ipr_worker_thread to start the disk
    scan by dropping the host lock and calling scsi_add_device. If an adapter
    reset request then arrives, the driver calls scsi_block_requests, which
    causes scsi_add_device to hang until requests are unblocked. But
    ipr_worker_thread cannot run to do the unblock because it is stuck in
    scsi_add_device.

    This patch fixes the issue.

    [mkp: typo and whitespace fixes]

    Signed-off-by: Wen Xiong
    Acked-by: Brian King
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Wen Xiong
     
  • [ Upstream commit adad633af7b970bfa5dd1b624a4afc83cac9b235 ]

    While reviewing another part of the code, Kees noticed that the strncpy of the
    partition name might not always be NUL terminated. Switch to using strscpy
    which does this safely.

    Reported-by: Kees Cook
    Signed-off-by: Laura Abbott
    Reviewed-by: Kees Cook
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
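
To illustrate the strncpy problem described above: when the source is at least as long as the bound, strncpy leaves the destination without a terminating NUL. The helper below is a hypothetical userspace sketch, loosely modelled on the always-terminate behaviour the message attributes to strscpy. The name bounded_copy() and its -1 return value are assumptions, not the kernel API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most size-1 bytes from src and always NUL-terminate dst.
 * Returns the number of bytes copied, or -1 if src did not fit
 * (dst then holds a truncated, still terminated, string).
 */
long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -1;

    len = strlen(src);
    if (len >= size) {
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);  /* includes the terminator */
    return (long)len;
}
```

Unlike strncpy, the caller can both rely on termination and detect truncation from the return value.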
     
  • [ Upstream commit d792d4c4fc866ae224b0b0ca2aabd87d23b4d6cc ]

    There's currently a warning about string overflow with strncat:

    drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c: In function 'ibmvscsis_probe':
    drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3479:2: error: 'strncat' specified
    bound 64 equals destination size [-Werror=stringop-overflow=]
    strncat(vscsi->eye, vdev->name, MAX_EYE);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Switch to a single snprintf instead of a strcpy + strcat to handle this
    cleanly.

    Signed-off-by: Laura Abbott
    Suggested-by: Kees Cook
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
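
The strncat warning above arises because a bound equal to the destination size leaves no room for the terminator once the destination already holds data. A single bounded snprintf avoids that. The sketch below is a userspace illustration: build_eye() and MAX_EYE_SKETCH are hypothetical names, with the size 64 taken from the warning text.

```c
#include <stdio.h>

#define MAX_EYE_SKETCH 64   /* assumed buffer size, from the warning text */

/*
 * Builds "<prefix><name>" into eye[] with one bounded call.
 * snprintf writes at most MAX_EYE_SKETCH bytes including the terminator.
 * Returns nonzero if the result was truncated or formatting failed.
 */
int build_eye(char eye[MAX_EYE_SKETCH], const char *prefix, const char *name)
{
    int n = snprintf(eye, MAX_EYE_SKETCH, "%s%s", prefix, name);

    return n < 0 || n >= MAX_EYE_SKETCH;
}
```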
     

18 Oct, 2018

1 commit

  • [ Upstream commit cbe3fd39d223f14b1c60c80fe9347a3dd08c2edb ]

    We should first do the le16_to_cpu endian conversion and then apply the
    FCP_CMD_LENGTH_MASK mask.

    Fixes: 5f35509db179 ("qla2xxx: Terminate exchange if corrupted")
    Signed-off-by: Dan Carpenter
    Acked-by: Quinn Tran
    Acked-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
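
The ordering issue in the qla2xxx fix above can be shown with a small userspace sketch: a mask defined in CPU byte order only makes sense once the little-endian wire value has been converted. The mask value 0x0fff and the function names here are assumptions for illustration, not taken from the driver.

```c
#include <stdint.h>

#define LENGTH_MASK_SKETCH 0x0fffu  /* assumed mask value, illustration only */

/* Portable little-endian 16-bit decode; works on any host byte order. */
uint16_t le16_decode(const uint8_t bytes[2])
{
    return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}

/* Convert first, then mask: the mask is expressed in CPU byte order. */
uint16_t cmd_length(const uint8_t wire[2])
{
    return (uint16_t)(le16_decode(wire) & LENGTH_MASK_SKETCH);
}
```

Masking the raw wire value first would give the same answer on a little-endian host but select the wrong bits on a big-endian one.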
     

10 Oct, 2018

2 commits

  • [ Upstream commit c77a2fa3ff8f73d1a485e67e6f81c64823739d59 ]

    The QED driver commit, 1ac4329a1cff ("qed: Add configuration information
    to register dump and debug data"), removes the CRC length validation
    causing nvm_get_image failure while loading qedi driver:

    [qed_mcp_get_nvm_image:2700(host_10-0)]Image [0] is too big - 00006008 bytes
    where only 00006004 are available
    [qedi_get_boot_info:2253]:10: Could not get NVM image. ret = -12

    Hence add and adjust the CRC size to iSCSI NVM image to read boot info at
    qedi load time.

    Signed-off-by: Nilesh Javali
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Nilesh Javali
     
  • [ Upstream commit 89809b028b6f54187b7d81a0c69b35d394c52e62 ]

    Reported-by: Colin Ian King
    Signed-off-by: Varun Prakash
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Varun Prakash
     

04 Oct, 2018

3 commits

  • [ Upstream commit c3b10a55abc943a526aaecd7e860b15671beb906 ]

    The firmware on the controller may have been upgraded before the system
    was suspended. During resume, the driver needs to read the updated
    controller properties.

    Signed-off-by: Shivasharan S
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shivasharan S
     
  • [ Upstream commit aa154ea885eb0c2407457ce9c1538d78c95456fa ]

    When ioremap_nocache fails, the lack of error-handling code may cause
    unexpected results.

    This patch adds error-handling code after calling ioremap_nocache.

    Signed-off-by: Zhouyang Jia
    Reviewed-by: Johannes Thumshirn
    Acked-by: Manish Rangankar
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Zhouyang Jia
     
  • [ Upstream commit 1262dc09dc9ae7bf4ad00b6a2c5ed6a6936bcd10 ]

    Currently an open firmware property is copied into the partition_name
    variable without leaving room for the terminating \0.

    Later on, this variable (partition_name), which is 97 bytes long, is
    copied with strncpy into ibmvscsi_host_data->madapter_info->partition_name,
    which is 96 bytes long, possibly truncating it 'again' and removing the \0.

    This patch simply decreases the partition name to 96 bytes and copies it
    using strlcpy(), which guarantees that the string is \0 terminated. I think
    there is no issue if there is a truncation in this very first copy, i.e.,
    when the open firmware property is read and copied into the driver for the
    very first time.

    This issue also causes the following warning on GCC 8:

    drivers/scsi/ibmvscsi/ibmvscsi.c:281:2: warning: strncpy output may be truncated copying 96 bytes from a string of length 96 [-Wstringop-truncation]
    ...
    inlined from ibmvscsi_probe at drivers/scsi/ibmvscsi/ibmvscsi.c:2221:7:
    drivers/scsi/ibmvscsi/ibmvscsi.c:265:3: warning: strncpy specified bound 97 equals destination size [-Wstringop-truncation]

    CC: Bart Van Assche
    CC: Tyrel Datwyler
    Signed-off-by: Breno Leitao
    Acked-by: Tyrel Datwyler
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Breno Leitao
     

26 Sep, 2018

1 commit

  • [ Upstream commit fa519f701d27198a2858bb108fc18ea9d8c106a7 ]

    fc_rport_login() will be calling mutex_lock() while running inside an
    RCU-protected section, triggering the warning 'sleeping function called
    from invalid context'. To fix this we can drop the rcu functions here
    altogether as the disc mutex protecting the list itself is already held,
    preventing any list manipulation.

    Fixes: a407c593398c ("scsi: libfc: Fixup disc_mutex handling")
    Signed-off-by: Hannes Reinecke
    Acked-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Hannes Reinecke
     

20 Sep, 2018

2 commits

  • [ Upstream commit 4dc98c1995482262e70e83ef029135247fafe0f2 ]

    tw_probe() returns 0 if tw_initialize_device_extension(),
    pci_resource_start() or tw_reset_sequence() fails, and releases resources.
    twl_probe() returns 0 if twl_initialize_device_extension(), pci_iomap() or
    twl_reset_sequence() fails. twa_probe() returns 0 if
    tw_initialize_device_extension(), ioremap() or twa_reset_sequence()
    fails.

    The patch adds retval initialization for these cases.

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Anton Vasilyev
    Acked-by: Adam Radford
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anton Vasilyev
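
The class of bug fixed in the 3w-* probe routines above is a shared cleanup label reached without an error code having been set. A minimal userspace sketch of the corrected shape follows. probe_sketch() and its step_fails[] parameter are hypothetical, and the specific errno values are illustrative choices rather than the ones used by the drivers.

```c
#include <errno.h>

/*
 * step_fails[i] != 0 simulates failure of the i-th initialisation step.
 * retval is assigned before every jump to the cleanup label, so a failed
 * step can never be reported to the caller as success (0).
 */
int probe_sketch(const int step_fails[3])
{
    int retval;

    if (step_fails[0]) {        /* e.g. allocating the device extension */
        retval = -ENOMEM;
        goto out_release;
    }
    if (step_fails[1]) {        /* e.g. mapping device registers */
        retval = -ENODEV;
        goto out_release;
    }
    if (step_fails[2]) {        /* e.g. the reset sequence */
        retval = -ENODEV;
        goto out_release;
    }
    return 0;

out_release:
    /* Resources acquired so far would be released here. */
    return retval;
}
```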
     
  • commit 53e13ee087a80e8d4fc95436318436e5c2c1f8c2 upstream.

    A recent change added some MDS processing in the lpfc_drain_txq routine
    that relies on the fcp_wq being allocated. For nvmet operation the fcp_wq
    is not allocated because it can only be an nvme-target. When the original
    MDS support was added, LS_MDS_LOOPBACK was defined incorrectly as 0x16; it
    should have been 0x10 (the decimal value 16 was used for a hex setting).
    This incorrect value allowed MDS_LOOPBACK to be set simultaneously with
    LS_NPIV_FAB_SUPPORTED, causing the driver to crash when it accesses the
    non-existent fcp_wq.

    Correct the bad value setting for LS_MDS_LOOPBACK.

    Fixes: ae9e28f36a6c ("lpfc: Add MDS Diagnostic support.")
    Cc: # v4.12+
    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Tested-by: Ewan D. Milne
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    James Smart
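
Why 0x16 misbehaves as a flag in the lpfc fix above can be checked with two small helpers. Only the values 0x16 and 0x10 come from the commit message. The value 0x02 for LS_NPIV_FAB_SUPPORTED is an assumption made for this sketch, as are the helper names.

```c
/* Only 0x16 and 0x10 come from the commit message; 0x02 is assumed. */
#define LS_NPIV_FAB_SUPPORTED_SKETCH 0x02u
#define LS_MDS_LOOPBACK_WRONG        0x16u  /* decimal 16 typed as hex */
#define LS_MDS_LOOPBACK_FIXED        0x10u  /* a single bit */

/* A flag can be tested independently in a bitfield only if it is one bit. */
int is_single_bit(unsigned int v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Nonzero when two flag values share at least one bit. */
int flags_overlap(unsigned int a, unsigned int b)
{
    return (a & b) != 0;
}
```

Because 0x16 is 0x10 | 0x04 | 0x02, a test such as `flags & 0x16` also fires when only one of the lower bits is set, which matches the simultaneous-setting symptom described in the message.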
     

15 Sep, 2018

1 commit

  • [ Upstream commit 0756c57bce3d26da2592d834d8910b6887021701 ]

    We accidentally return success instead of -ENOMEM on this error path.

    Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
    Signed-off-by: Dan Carpenter
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: John Garry
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     

05 Sep, 2018

7 commits

  • commit 0ee223b2e1f67cb2de9c0e3247c510d846e74d63 upstream.

    A long time ago the unfortunate decision was taken to add a self-deletion
    attribute to the sysfs SCSI device directory. That decision was unfortunate
    because self-deletion is really tricky. We can't drop that attribute
    because widely used user space software depends on it, namely the
    rescan-scsi-bus.sh script. Hence this patch, which prevents writes to that
    attribute from triggering a deadlock. See also commit 7973cbd9fbd9 ("[PATCH]
    add sysfs attributes to scan and delete scsi_devices").

    This patch prevents self-removal from triggering the following deadlock:

    ======================================================
    WARNING: possible circular locking dependency detected
    4.18.0-rc2-dbg+ #5 Not tainted
    ------------------------------------------------------
    modprobe/6539 is trying to acquire lock:
    000000008323c4cd (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x45/0x90

    but task is already holding lock:
    00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&shost->scan_mutex){+.+.}:
    __mutex_lock+0xfe/0xc70
    mutex_lock_nested+0x1b/0x20
    scsi_remove_device+0x26/0x40 [scsi_mod]
    sdev_store_delete+0x27/0x30 [scsi_mod]
    dev_attr_store+0x3e/0x50
    sysfs_kf_write+0x87/0xa0
    kernfs_fop_write+0x190/0x230
    __vfs_write+0xd2/0x3b0
    vfs_write+0x101/0x270
    ksys_write+0xab/0x120
    __x64_sys_write+0x43/0x50
    do_syscall_64+0x77/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    -> #0 (kn->count#202){++++}:
    lock_acquire+0xd2/0x260
    __kernfs_remove+0x424/0x4a0
    kernfs_remove_by_name_ns+0x45/0x90
    remove_files.isra.1+0x3a/0x90
    sysfs_remove_group+0x5c/0xc0
    sysfs_remove_groups+0x39/0x60
    device_remove_attrs+0x82/0xb0
    device_del+0x251/0x580
    __scsi_remove_device+0x19f/0x1d0 [scsi_mod]
    scsi_forget_host+0x37/0xb0 [scsi_mod]
    scsi_remove_host+0x9b/0x150 [scsi_mod]
    sdebug_driver_remove+0x4b/0x150 [scsi_debug]
    device_release_driver_internal+0x241/0x360
    device_release_driver+0x12/0x20
    bus_remove_device+0x1bc/0x290
    device_del+0x259/0x580
    device_unregister+0x1a/0x70
    sdebug_remove_adapter+0x8b/0xf0 [scsi_debug]
    scsi_debug_exit+0x76/0xe8 [scsi_debug]
    __x64_sys_delete_module+0x1c1/0x280
    do_syscall_64+0x77/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    other info that might help us debug this:

    Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&shost->scan_mutex);
                                   lock(kn->count#202);
                                   lock(&shost->scan_mutex);
      lock(kn->count#202);

    *** DEADLOCK ***

    2 locks held by modprobe/6539:
    #0: 00000000efaf9298 (&dev->mutex){....}, at: device_release_driver_internal+0x68/0x360
    #1: 00000000a6ec2c69 (&shost->scan_mutex){+.+.}, at: scsi_remove_host+0x21/0x150 [scsi_mod]

    stack backtrace:
    CPU: 10 PID: 6539 Comm: modprobe Not tainted 4.18.0-rc2-dbg+ #5
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
    Call Trace:
    dump_stack+0xa4/0xf5
    print_circular_bug.isra.34+0x213/0x221
    __lock_acquire+0x1a7e/0x1b50
    lock_acquire+0xd2/0x260
    __kernfs_remove+0x424/0x4a0
    kernfs_remove_by_name_ns+0x45/0x90
    remove_files.isra.1+0x3a/0x90
    sysfs_remove_group+0x5c/0xc0
    sysfs_remove_groups+0x39/0x60
    device_remove_attrs+0x82/0xb0
    device_del+0x251/0x580
    __scsi_remove_device+0x19f/0x1d0 [scsi_mod]
    scsi_forget_host+0x37/0xb0 [scsi_mod]
    scsi_remove_host+0x9b/0x150 [scsi_mod]
    sdebug_driver_remove+0x4b/0x150 [scsi_debug]
    device_release_driver_internal+0x241/0x360
    device_release_driver+0x12/0x20
    bus_remove_device+0x1bc/0x290
    device_del+0x259/0x580
    device_unregister+0x1a/0x70
    sdebug_remove_adapter+0x8b/0xf0 [scsi_debug]
    scsi_debug_exit+0x76/0xe8 [scsi_debug]
    __x64_sys_delete_module+0x1c1/0x280
    do_syscall_64+0x77/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    See also https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg54525.html.

    Fixes: ac0ece9174ac ("scsi: use device_remove_file_self() instead of device_schedule_callback()")
    Signed-off-by: Bart Van Assche
    Cc: Greg Kroah-Hartman
    Acked-by: Tejun Heo
    Cc: Johannes Thumshirn
    Cc:
    Signed-off-by: Greg Kroah-Hartman

    Signed-off-by: Martin K. Petersen

    Bart Van Assche
     
  • commit 91b7bdb2c0089cbbb817df6888ab1458c645184e upstream.

    This patch prevents smatch from complaining about a double unlock on
    ioc->transport_cmds.mutex.

    Fixes: 651a01364994 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
    Signed-off-by: Bart Van Assche
    Cc: Christoph Hellwig
    Cc: Sathya Prakash
    Cc: Chaitra P B
    Cc: Suganath Prabu Subramani
    Cc: stable@vger.kernel.org
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • [ Upstream commit e95153b64d03c2b6e8d62e51bdcc33fcad6e0856 ]

    Commands that are reset are returned with status
    SAM_STAT_COMMAND_TERMINATED. PVSCSI currently returns DID_OK |
    SAM_STAT_COMMAND_TERMINATED which fails the command. Instead, set hostbyte
    to DID_RESET to allow upper layers to retry.

    Tested by copying a large file between two pvscsi disks on same adapter
    while performing a bus reset at 1-second intervals. Before fix, commands
    sometimes fail with DID_OK. After fix, commands observed to fail with
    DID_RESET.

    Signed-off-by: Jim Gill
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jim Gill
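
For the PVSCSI change above, it may help to see how a SCSI result word combines a host byte and a status byte. The sketch below assumes the conventional layout (status in bits 0-7, host byte in bits 16-23) and uses constant values as I recall them from the kernel headers (DID_OK 0x00, DID_RESET 0x08, SAM_STAT_COMMAND_TERMINATED 0x22). Treat the values and the helper names as assumptions.

```c
#include <stdint.h>

/* Values recalled from kernel headers; treat them as assumptions. */
#define DID_OK_SKETCH                      0x00u
#define DID_RESET_SKETCH                   0x08u
#define SAM_STAT_COMMAND_TERMINATED_SKETCH 0x22u

/* Compose a result word: host byte in bits 16-23, status in bits 0-7. */
uint32_t make_result(uint32_t host, uint32_t status)
{
    return ((host & 0xffu) << 16) | (status & 0xffu);
}

/* Extract the host byte that upper layers inspect to decide on a retry. */
uint32_t host_byte_of(uint32_t result)
{
    return (result >> 16) & 0xffu;
}
```

With DID_OK in the host byte, only the terminated status is visible and the command fails; with DID_RESET, the upper layers can see that a reset occurred and retry.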
     
  • [ Upstream commit 1550ec458e0cf1a40a170ab1f4c46e3f52860f65 ]

    When receiving a LOGO request we forget to clear the FC_RP_STARTED flag
    before starting the rport delete routine.

    As the started flag was not cleared, we're not deleting the rport but
    waiting for a restart and thus are keeping the reference count of the rdata
    object at 1.

    This leads to the following kmemleak report:
    unreferenced object 0xffff88006542aa00 (size 512):
    comm "kworker/0:2", pid 24, jiffies 4294899222 (age 226.880s)
    hex dump (first 32 bytes):
    68 96 fe 65 00 88 ff ff 00 00 00 00 00 00 00 00 h..e............
    01 00 00 00 08 00 00 00 02 c5 45 24 ac b8 00 10 ..........E$....
    backtrace:
    [] fcoe_ctlr_vn_add.isra.5+0x7f/0x770 [libfcoe]
    [] fcoe_ctlr_vn_recv+0x12af/0x27f0 [libfcoe]
    [] fcoe_ctlr_recv_work+0xd01/0x32f0 [libfcoe]
    [] process_one_work+0x7ff/0x1420
    [] worker_thread+0x87/0xef0
    [] kthread+0x2db/0x390
    [] ret_from_fork+0x35/0x40
    [] 0xffffffffffffffff

    Signed-off-by: Johannes Thumshirn
    Reported-by: ard
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
     
  • [ Upstream commit 63d0e3dffda311e77b9a8c500d59084e960a824a ]

    Drop the frames in the ELS LOGO error path instead of just returning an
    error.

    This fixes the following kmemleak report:
    unreferenced object 0xffff880064cb1000 (size 424):
    comm "kworker/0:2", pid 24, jiffies 4294904293 (age 68.504s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] _fc_frame_alloc+0x2c/0x180 [libfc]
    [] fc_lport_enter_logo+0x106/0x360 [libfc]
    [] fc_fabric_logoff+0x8c/0xc0 [libfc]
    [] fcoe_if_destroy+0x79/0x3b0 [fcoe]
    [] fcoe_destroy_work+0xd2/0x170 [fcoe]
    [] process_one_work+0x7ff/0x1420
    [] worker_thread+0x87/0xef0
    [] kthread+0x2db/0x390
    [] ret_from_fork+0x35/0x40
    [] 0xffffffffffffffff

    which can be triggered by issuing
    echo eth0 > /sys/bus/fcoe/ctlr_destroy

    Signed-off-by: Johannes Thumshirn
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
     
  • [ Upstream commit 2d7d4fd35e6e15b47c13c70368da83add19f01e7 ]

    KASAN reports a use-after-free in fcoe_ctlr_els_send() when we're sending a
    LOGO and have FIP debugging enabled. This is because we're first freeing
    the skb and then printing the frame's DID. But the DID is a member of the
    FC frame header which in turn is the skb's payload.

    Exchange the debug print and kfree_skb() calls so we're not touching the
    freed data.

    Signed-off-by: Johannes Thumshirn
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
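
The use-after-free fix above is purely about ordering: read what the log message needs, print it, and free the buffer last. A minimal userspace sketch, with a hypothetical frame_sketch structure standing in for the skb payload and free() standing in for kfree_skb():

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a frame header living inside the buffer. */
struct frame_sketch {
    unsigned int did;
};

/*
 * Logs the destination ID and then releases the buffer.
 * The field is read while the memory is still valid; after free()
 * the pointer must not be touched. Returns the DID that was logged.
 */
unsigned int log_then_free(struct frame_sketch *f)
{
    unsigned int did = f->did;

    fprintf(stderr, "sent LOGO to %06x\n", did);
    free(f);
    return did;
}
```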
     
  • [ Upstream commit a17037e7d59075053b522048742a08ac9500bde8 ]

    In iscsi_check_tmf_restrictions() task->hdr is dereferenced to print the
    opcode, but it is possible that task->hdr is NULL.

    There are two cases based on opcode argument:

    1. ISCSI_OP_SCSI_CMD - In this case alloc_pdu() is called
    after iscsi_check_tmf_restrictions()

    iscsi_prep_scsi_cmd_pdu() -> iscsi_check_tmf_restrictions() -> alloc_pdu().

    Transport drivers allocate memory for iSCSI hdr in alloc_pdu() and assign
    it to task->hdr. In case of TMF task->hdr will be NULL resulting in NULL
    pointer dereference.

    2. ISCSI_OP_SCSI_DATA_OUT - In this case transport driver can free the
    memory for iSCSI hdr after transmitting the pdu so task->hdr can be NULL or
    invalid.

    This patch fixes this issue by removing task->hdr->opcode from the printk
    statement.

    Signed-off-by: Varun Prakash
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Varun Prakash
     

24 Aug, 2018

3 commits

  • [ Upstream commit a3440d0d2f57f7ba102fc332086961cf261180af ]

    In an iSCSI offload BFS environment, the MFW requires the virtual link to
    be marked based upon qedi load status.

    Signed-off-by: Manish Rangankar
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Manish Rangankar
     
  • [ Upstream commit 6ac174756dfc9884f08b23af840ca911155f5578 ]

    Need to notify firmware when driver is loaded and unloaded.

    Signed-off-by: Saurav Kashyap
    Signed-off-by: Chad Dupuis
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Saurav Kashyap
     
  • [ Upstream commit 93efbd39870474cc536b9caf4a6efeb03b0bc56f ]

    When xenbus_printf fails, the lack of error-handling code may
    cause unexpected results.

    This patch adds error-handling code after calling xenbus_printf.

    Signed-off-by: Zhouyang Jia
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Zhouyang Jia
     

16 Aug, 2018

5 commits

  • commit 5e53be8e476a3397ed5383c23376f299555a2b43 upstream.

    In the case of IOCB QFull, Initiator code can leave behind a stale pointer
    to an SRB structure on the outstanding command array.

    Fixes: 82de802ad46e ("scsi: qla2xxx: Preparation for Target MQ.")
    Cc: stable@vger.kernel.org #v4.16+
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 1214fd7b497400d200e3f4e64e2338b303a20949 upstream.

    Surround scsi_execute() calls with scsi_autopm_get_device() and
    scsi_autopm_put_device(). Note: removing sr_mutex protection from the
    scsi_cd_get() and scsi_cd_put() calls is safe because the purpose of
    sr_mutex is to serialize cdrom_*() calls.

    This patch prevents complaints similar to the following from appearing in
    the kernel log if runtime power management is enabled:

    INFO: task systemd-udevd:650 blocked for more than 120 seconds.
    Not tainted 4.18.0-rc7-dbg+ #1
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    systemd-udevd D28176 650 513 0x00000104
    Call Trace:
    __schedule+0x444/0xfe0
    schedule+0x4e/0xe0
    schedule_preempt_disabled+0x18/0x30
    __mutex_lock+0x41c/0xc70
    mutex_lock_nested+0x1b/0x20
    __blkdev_get+0x106/0x970
    blkdev_get+0x22c/0x5a0
    blkdev_open+0xe9/0x100
    do_dentry_open.isra.19+0x33e/0x570
    vfs_open+0x7c/0xd0
    path_openat+0x6e3/0x1120
    do_filp_open+0x11c/0x1c0
    do_sys_open+0x208/0x2d0
    __x64_sys_openat+0x59/0x70
    do_syscall_64+0x77/0x230
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Signed-off-by: Bart Van Assche
    Cc: Maurizio Lombardi
    Cc: Johannes Thumshirn
    Cc: Alan Stern
    Cc:
    Tested-by: Johannes Thumshirn
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit b5b6e8c8d3b4cbeb447a0f10c7d5de3caa573299 upstream.

    Since commit 84676c1f21e8ff5 ("genirq/affinity: assign vectors to all
    possible CPUs") it is possible to end up in a scenario where only
    offline CPUs are mapped to an interrupt vector.

    This is only an issue for the legacy I/O path since with blk-mq/scsi-mq
    an I/O can't be submitted to a hardware queue if the queue isn't mapped
    to an online CPU.

    Fix this issue by forcing virtio-scsi to use blk-mq.

    [mkp: commit desc]

    Cc: Omar Sandoval
    Cc: "Martin K. Petersen"
    Cc: James Bottomley
    Cc: Christoph Hellwig
    Cc: Don Brace
    Cc: Kashyap Desai
    Cc: Paolo Bonzini
    Cc: Mike Snitzer
    Cc: Laurence Oberman
    Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
    Signed-off-by: Ming Lei
    Reviewed-by: Hannes Reinecke
    Acked-by: Paolo Bonzini
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • commit 2f31115e940c4afd49b99c33123534e2ac924ffb upstream.

    This patch introduces 'force_blk_mq' to the scsi_host_template so that
    drivers that have no desire to support the legacy I/O path can signal
    blk-mq only support.

    [mkp: commit desc]

    Cc: Omar Sandoval
    Cc: "Martin K. Petersen"
    Cc: James Bottomley
    Cc: Christoph Hellwig
    Cc: Don Brace
    Cc: Kashyap Desai
    Cc: Mike Snitzer
    Cc: Laurence Oberman
    Signed-off-by: Ming Lei
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • commit 8b834bff1b73dce46f4e9f5e84af6f73fed8b0ef upstream.

    Since commit 84676c1f21e8 ("genirq/affinity: assign vectors to all
    possible CPUs") we could end up with an MSI-X vector that did not have
    any online CPUs mapped. This would lead to I/O hangs since there was no
    CPU to receive the completion.

    Retrieve IRQ affinity information using pci_irq_get_affinity() and use
    this mapping to choose a reply queue.

    [mkp: tweaked commit desc]

    Cc: Hannes Reinecke
    Cc: "Martin K. Petersen"
    Cc: James Bottomley
    Cc: Christoph Hellwig
    Cc: Don Brace
    Cc: Kashyap Desai
    Cc: Laurence Oberman
    Cc: Meelis Roos
    Cc: Artem Bityutskiy
    Cc: Mike Snitzer
    Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs")
    Signed-off-by: Ming Lei
    Tested-by: Laurence Oberman
    Tested-by: Don Brace
    Tested-by: Artem Bityutskiy
    Acked-by: Don Brace
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     

09 Aug, 2018

4 commits

  • commit b4146c4929ef61d5afca011474d59d0918a0cd82 upstream.

    Propagate the task management completion status properly to avoid
    unnecessary waits for commands to complete.

    Fixes: faef62d13463 ("[SCSI] qla2xxx: Fix Task Management command asynchronous handling")
    Cc:
    Signed-off-by: Anil Gurumurthy
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Anil Gurumurthy
     
  • commit b08abbd9f5996309f021684f9ca74da30dcca36a upstream.

    During the unload process, the chip can encounter a problem where a FW dump
    would be captured. In this case, the full reset sequence that would bring
    the chip back to a fully operational state is skipped.

    Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
    Cc:
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit efa93f48fa9d423fda166bc3b6c0cbb09682492e upstream.

    Add wait for session deletion to finish before freeing an NPIV scsi host.

    Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
    Cc:
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit e3dde080ebbdbb4bda8eee35d770714fee8c59ac upstream.

    When the IOCB queue is full, or on a system where memory is low and the
    driver receives a large RSCN storm, a stale sp pointer can stay on
    gpnid_list, resulting in a page fault.

    This patch fixes this issue by initializing the sp->elem list head and
    removing sp->elem before memory is freed.

    The following stack trace is seen:

    9 [ffff987b37d1bc60] page_fault at ffffffffad516768 [exception RIP: qla24xx_async_gpnid+496]
    10 [ffff987b37d1bd10] qla24xx_async_gpnid at ffffffffc039866d [qla2xxx]
    11 [ffff987b37d1bd80] qla2x00_do_work at ffffffffc036169c [qla2xxx]
    12 [ffff987b37d1be38] qla2x00_do_dpc_all_vps at ffffffffc03adfed [qla2xxx]
    13 [ffff987b37d1be78] qla2x00_do_dpc at ffffffffc036458a [qla2xxx]
    14 [ffff987b37d1bec8] kthread at ffffffffacebae31

    Fixes: 2d73ac6102d9 ("scsi: qla2xxx: Serialize GPNID for multiple RSCN")
    Cc: # v4.17+
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     

06 Aug, 2018

1 commit

  • commit c170e5a8d222537e98aa8d4fddb667ff7a2ee114 upstream.

    Fix a minor memory leak when there is an error opening a /dev/sg device.

    Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling")
    Cc:
    Reviewed-by: Ewan D. Milne
    Signed-off-by: Tony Battersby
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Tony Battersby
     

03 Aug, 2018

5 commits

  • [ Upstream commit 465891fe9237b02f8d0fd26448f733fae7236f4a ]

    The SISLite specification originally defined the context control register with
    a single field of bits to represent the LISN and also stipulated that the
    register reset value be 0. The cxlflash driver took advantage of this when
    programming the LISN for the master contexts via an unconditional write - no
    other bits were preserved.

    When unmap support was added, SISLite was updated to define bit 0 of the
    context control register as a way for the AFU to notify the context owner that
    unmap operations were supported. Thus the assumptions under which the register
    is set up changed and the existing unconditional write is clobbering the unmap
    state for master contexts. This is presently not an issue due to the order in
    which the context control register is programmed in relation to the unmap bit
    being queried but should be addressed to avoid a future regression in the
    event this code is moved elsewhere.

    To remedy this issue, preserve the bits when programming the LISN field in the
    context control register. Since the LISN will now be programmed using a read
    value, assert that the initial state of the LISN field is as described in
    SISLite (0).

    Signed-off-by: Matthew R. Ochs
    Signed-off-by: Uma Krishnan
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Matthew R. Ochs
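
The cxlflash change above replaces an unconditional register write with a read-modify-write that preserves the other bits. A userspace sketch of that idea follows. The register is modelled as a plain 64-bit variable, and the field position (low 16 bits), LISN_MASK_SKETCH and program_lisn() are hypothetical, not the SISLite layout.

```c
#include <stdint.h>

/* Hypothetical field position; the real SISLite layout is not shown here. */
#define LISN_MASK_SKETCH 0x000000000000ffffull

/*
 * Programs the LISN field while preserving every other bit.
 * Because the new value is OR-ed into the value read back, the field
 * must start at its documented reset value of 0; otherwise -1 is
 * returned and the register is left untouched.
 */
int program_lisn(uint64_t *reg, uint16_t lisn)
{
    uint64_t val = *reg;            /* read */

    if (val & LISN_MASK_SKETCH)
        return -1;                  /* field not in reset state */

    *reg = val | lisn;              /* modify and write back */
    return 0;
}
```

An unconditional `*reg = lisn` would have cleared any capability bit already set outside the LISN field, which is the clobbering the message describes.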
     
  • [ Upstream commit a3feb6ef50def7c91244d7bd15a3625b7b49b81f ]

    The following Oops can be encountered if a device removal or system shutdown
    is initiated while an EEH recovery is in process:

    [c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100
    [cxlflash]
    [c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
    [c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170
    [c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
    [c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580
    [c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
    [c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
    [c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0
    [c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

    The remove handler frees AFU memory while the EEH recovery is in progress,
    leading to a race condition. This can result in a crash if the recovery thread
    tries to access this memory.

    To resolve this issue, the cxlflash remove handler will evaluate the device
    state and yield to any active reset or probing threads.

    Signed-off-by: Uma Krishnan
    Acked-by: Matthew R. Ochs
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Uma Krishnan
     
  • [ Upstream commit 3239b8cd28fd849a2023483257d35d68c5876c74 ]

    Hardware could time out Fastpath IOs one second earlier than the timeout
    provided by the host.

    For non-RAID devices, the driver provides a timeout value based on the
    timeout value provided by the OS. Under certain scenarios, if the OS
    provides a timeout value of 1 second, the hardware will time out
    immediately because of the above behavior.

    Increase timeout value for non-RAID fastpath IOs by 1 second.

    Signed-off-by: Shivasharan S
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shivasharan S
     
  • [ Upstream commit 37b37d2609cb0ac267280ef27350b962d16d272e ]

    SGI/TP9100 is not an RDAC array:
    ^^^
    https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235

    This partially reverts commit 35204772ea03 ("[SCSI] scsi_dh_rdac :
    Consolidate rdac strings together")

    [mkp: fixed up the new entries to align with rest of struct]

    Cc: NetApp RDAC team
    Cc: Hannes Reinecke
    Cc: James E.J. Bottomley
    Cc: Martin K. Petersen
    Cc: SCSI ML
    Cc: DM ML
    Signed-off-by: Xose Vazquez Perez
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xose Vazquez Perez
     
  • [ Upstream commit 4f4616ceebaf045c59e8a6aa01f08826d18d5c63 ]

    Similar to what we do when we remove a PCI function, set the
    QEDF_UNLOADING flag to prevent any requests from being queued while a
    vport is being deleted. This prevents any requests from getting stuck
    in limbo when the vport is unloaded or deleted.

    Fixes the crash:

    PID: 106676 TASK: ffff9a436aa90000 CPU: 12 COMMAND: "multipathd"
    #0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a
    #1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512
    #2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600
    #3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768
    #4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52
    #5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9
    #6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379
    #7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf
    #8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915
    #9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768
    [exception RIP: qedf_init_task+61]
    RIP: ffffffffc0e13c2d RSP: ffff9a43567d38d0 RFLAGS: 00010046
    RAX: 0000000000000000 RBX: ffffbe920472c738 RCX: ffff9a434fa0e3e8
    RDX: ffff9a434f695280 RSI: ffffbe920472c738 RDI: ffff9a43aa359c80
    RBP: ffff9a43567d3950 R8: 0000000000000c15 R9: ffff9a3fb09b9880
    R10: ffff9a434fa0e3e8 R11: ffff9a43567d35ce R12: 0000000000000000
    R13: ffff9a434f695280 R14: ffff9a43aa359c80 R15: ffff9a3fb9e005c0
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

    Signed-off-by: Chad Dupuis
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Chad Dupuis
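
The qedf fix above gates new requests on an "unloading" flag that is set before teardown begins. A minimal sketch of that gate using C11 atomics is shown below. The structure and function names are hypothetical, and a real driver would also need to wait for requests that passed the check before the flag was set.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-port state. */
struct port_sketch {
    atomic_bool unloading;
    int queued;
};

void port_init(struct port_sketch *p)
{
    atomic_init(&p->unloading, false);
    p->queued = 0;
}

/* Called first in the delete path, before any resources are torn down. */
void port_begin_unload(struct port_sketch *p)
{
    atomic_store(&p->unloading, true);
}

/* Rejects new work once unloading has started instead of queueing it. */
int port_queue(struct port_sketch *p)
{
    if (atomic_load(&p->unloading))
        return -EBUSY;

    p->queued++;
    return 0;
}
```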