12 Feb, 2019

1 commit

  • errata:
    When a read command returns less data than specified in the PRDs (for
    example, there are two PRDs for this command, but the device returns a
    number of bytes which is less than in the first PRD), the second PRD of
    this command is not read out of the PRD FIFO, causing the next command
    to use this PRD erroneously.

    workaround
    - force sg_tablesize = 1
    - modify the sg_io function in block/scsi_ioctl.c to use a 64k buffer
    allocated with dma_alloc_coherent during the probe in ahci_imx
    - to fix the scsi/sata hang seen when CD_ROM and HDD are accessed
    simultaneously after the workaround is applied, do not go to sleep in
    scsi_eh_handler when a host failure is pending

    Signed-off-by: Richard Zhu

    Richard Zhu
     

26 Jan, 2019

3 commits

  • [ Upstream commit c7a082e4242fd8cd21a441071e622f87c16bdacc ]

    UBSAN reported those with MegaRAID SAS-3 3108,

    [ 77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
    [ 77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
    [ 77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1
    [ 77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
    [ 77.495791] Workqueue: events work_for_cpu_fn
    [ 77.500154] Call trace:
    [ 77.502610] dump_backtrace+0x0/0x2c8
    [ 77.506279] show_stack+0x24/0x30
    [ 77.509604] dump_stack+0x118/0x19c
    [ 77.513098] ubsan_epilogue+0x14/0x60
    [ 77.516765] __ubsan_handle_out_of_bounds+0xfc/0x13c
    [ 77.521767] mr_update_load_balance_params+0x150/0x158 [megaraid_sas]
    [ 77.528230] MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas]
    [ 77.533825] megasas_get_map_info+0x244/0x2f0 [megaraid_sas]
    [ 77.539505] megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas]
    [ 77.545794] megasas_init_fw+0x1ab4/0x3518 [megaraid_sas]
    [ 77.551212] megasas_probe_one+0x2c4/0xbe0 [megaraid_sas]
    [ 77.556614] local_pci_probe+0x7c/0xf0
    [ 77.560365] work_for_cpu_fn+0x34/0x50
    [ 77.564118] process_one_work+0x61c/0xf08
    [ 77.568129] worker_thread+0x534/0xa70
    [ 77.571882] kthread+0x1c8/0x1d0
    [ 77.575114] ret_from_fork+0x10/0x1c

    [ 89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
    [ 89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
    [ 89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1
    [ 89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
    [ 89.268903] Workqueue: events_unbound async_run_entry_fn
    [ 89.274222] Call trace:
    [ 89.276680] dump_backtrace+0x0/0x2c8
    [ 89.280348] show_stack+0x24/0x30
    [ 89.283671] dump_stack+0x118/0x19c
    [ 89.287167] ubsan_epilogue+0x14/0x60
    [ 89.290835] __ubsan_handle_out_of_bounds+0xfc/0x13c
    [ 89.295828] MR_LdRaidGet+0x50/0x58 [megaraid_sas]
    [ 89.300638] megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas]
    [ 89.306576] megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas]
    [ 89.313468] megasas_queue_command+0x398/0x3d0 [megaraid_sas]
    [ 89.319222] scsi_dispatch_cmd+0x1dc/0x8a8
    [ 89.323321] scsi_request_fn+0x8e8/0xdd0
    [ 89.327249] __blk_run_queue+0xc4/0x158
    [ 89.331090] blk_execute_rq_nowait+0xf4/0x158
    [ 89.335449] blk_execute_rq+0xdc/0x158
    [ 89.339202] __scsi_execute+0x130/0x258
    [ 89.343041] scsi_probe_and_add_lun+0x2fc/0x1488
    [ 89.347661] __scsi_scan_target+0x1cc/0x8c8
    [ 89.351848] scsi_scan_channel.part.3+0x8c/0xc0
    [ 89.356382] scsi_scan_host_selected+0x130/0x1f0
    [ 89.361002] do_scsi_scan_host+0xd8/0xf0
    [ 89.364927] do_scan_async+0x9c/0x320
    [ 89.368594] async_run_entry_fn+0x138/0x420
    [ 89.372780] process_one_work+0x61c/0xf08
    [ 89.376793] worker_thread+0x13c/0xa70
    [ 89.380546] kthread+0x1c8/0x1d0
    [ 89.383778] ret_from_fork+0x10/0x1c

    This is because when populating Driver Map using firmware raid map, all
    non-existing VDs set their ldTgtIdToLd to 0xff, so it can be skipped later.

    From drivers/scsi/megaraid/megaraid_sas_base.c,
    memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS);

    From drivers/scsi/megaraid/megaraid_sas_fp.c,
    /* For non existing VDs, iterate to next VD*/
    if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1))
            continue;

    However, there are a few places that failed to skip those non-existing VDs
    due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff,
    map) and triggered the out-of-bound accesses.
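
    A minimal sketch of the off-by-one described above, not the driver's
    code. The constant value and the helper names are assumptions made for
    illustration; only the 0xff sentinel and the ">= (MAX - 1)" comparison
    come from the snippets quoted in this message.

```c
#include <stdbool.h>

/* Assumed values for illustration; the driver defines its own constants. */
#define SKETCH_MAX_LOGICAL_DRIVES_EXT 256
#define SKETCH_LD_NOT_PRESENT 0xff

/* Off-by-one variant: the sentinel 0xff (255) is below 256, so it passes. */
bool sketch_ld_usable_off_by_one(unsigned int ld)
{
    return ld < SKETCH_MAX_LOGICAL_DRIVES_EXT;
}

/* Mirrors the quoted check: anything >= (MAX - 1) is treated as absent. */
bool sketch_ld_usable(unsigned int ld)
{
    return ld < (SKETCH_MAX_LOGICAL_DRIVES_EXT - 1);
}
```

    With the first helper the sentinel would be used as an array index; with
    the second it is skipped.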

    Fixes: 51087a8617fe ("megaraid_sas : Extended VD support")
    Signed-off-by: Qian Cai
    Acked-by: Sumit Saxena
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Qian Cai
     
  • [ Upstream commit e57b2945aa654e48f85a41e8917793c64ecb9de8 ]

    We must free all irqs during shutdown, else kexec's 2nd kernel would hang
    in pqi_wait_for_completion_io() as below:

    Call trace:

    pqi_wait_for_completion_io
    pqi_submit_raid_request_synchronous.constprop.78+0x23c/0x310 [smartpqi]
    pqi_configure_events+0xec/0x1f8 [smartpqi]
    pqi_ctrl_init+0x814/0xca0 [smartpqi]
    pqi_pci_probe+0x400/0x46c [smartpqi]
    local_pci_probe+0x48/0xb0
    pci_device_probe+0x14c/0x1b0
    really_probe+0x218/0x3fc
    driver_probe_device+0x70/0x140
    __driver_attach+0x11c/0x134
    bus_for_each_dev+0x70/0xc8
    driver_attach+0x30/0x38
    bus_add_driver+0x1f0/0x294
    driver_register+0x74/0x12c
    __pci_register_driver+0x64/0x70
    pqi_init+0xd0/0x10000 [smartpqi]
    do_one_initcall+0x60/0x1d8
    do_init_module+0x64/0x1f8
    load_module+0x10ec/0x1350
    __se_sys_finit_module+0xd4/0x100
    __arm64_sys_finit_module+0x28/0x34
    el0_svc_handler+0x104/0x160
    el0_svc+0x8/0xc

    This happens only in the following combinations:

    1. smartpqi is built as module, not built-in;
    2. We have a disk connected to smartpqi card;
    3. Both kexec's 1st and 2nd kernels use this disk as Rootfs' mount point.

    Signed-off-by: Yanjiang Jin
    Acked-by: Don Brace
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Yanjiang Jin
     
  • [ Upstream commit 2ba55c9851d74eb015a554ef69ddf2ef061d5780 ]

    Problem:
    The Linux kernel takes a logical volume offline after a LUN reset. This is
    generally accompanied by this message in the dmesg output:

    Device offlined - not ready after error recovery

    Root Cause:
    The root cause is a "quirk" in the timeout handling in the Linux SCSI
    layer. The Linux kernel places a 30-second timeout on most media access
    commands (reads and writes) that it sends to device drivers. When a media
    access command times out, the Linux kernel goes into error recovery mode
    for the LUN that was the target of the command that timed out. Every
    command that timed out is kept on a list inside of the Linux kernel to be
    retried later. The kernel attempts to recover the command(s) that timed out
    by issuing a LUN reset followed by a TEST UNIT READY. If the LUN reset and
    TEST UNIT READY commands are successful, the kernel retries the command(s)
    that timed out.

    Each SCSI command issued by the kernel has a result field associated with
    it. This field indicates the final result of the command (success or
    error). When a command times out, the kernel places a value in this result
    field indicating that the command timed out.

    The "quirk" is that after the LUN reset and TEST UNIT READY commands are
    completed, the kernel checks each command on the timed-out command list
    before retrying it. If the result field is still "timed out", the kernel
    treats that command as not having been successfully recovered for a
    retry. If the number of commands in this state is greater than
    two, the kernel takes the LUN offline.

    Fix:
    When our RAIDStack receives a LUN reset, it simply waits until all
    outstanding commands complete. Generally, all of these outstanding commands
    complete successfully. Therefore, the fix in the smartpqi driver is to
    always set the command result field to indicate success when a request
    completes successfully. This normally isn’t necessary because the result
    field is always initialized to success when the command is submitted to the
    driver. So when the command completes successfully, the result field is
    left untouched. But in this case, the kernel changes the result field
    behind the driver’s back and then expects the field to be changed by the
    driver as the commands that timed-out complete.
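
    The idea can be sketched as follows. This is a simplified model, not
    the smartpqi code: the struct, the helper names and the placement of
    the host byte in bits 16..23 are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the midlayer's host-byte codes. */
enum sketch_host_byte {
    SKETCH_DID_OK = 0x00,
    SKETCH_DID_TIME_OUT = 0x03
};

struct sketch_cmd {
    uint32_t result; /* host byte kept in bits 16..23 in this sketch */
};

/* Midlayer side: the timeout handler overwrites the result field. */
void sketch_mark_timed_out(struct sketch_cmd *cmd)
{
    cmd->result = (uint32_t)SKETCH_DID_TIME_OUT << 16;
}

/*
 * Driver side: on successful completion, write "success" explicitly
 * instead of assuming the field still holds its initial value.
 */
void sketch_complete_ok(struct sketch_cmd *cmd)
{
    cmd->result = (uint32_t)SKETCH_DID_OK << 16;
}
```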

    Reviewed-by: Dave Carroll
    Reviewed-by: Scott Teel
    Signed-off-by: Kevin Barnett
    Signed-off-by: Don Brace
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Kevin Barnett
     

23 Jan, 2019

2 commits

  • commit 44759979a49bfd2d20d789add7fa81a21eb1a4ab upstream.

    Changing the caching mode via /sys/devices/.../scsi_disk/.../cache_type may
    fail if the device responds to the MODE SENSE command with the DPOFUA flag
    set and then requires this flag to be clear in the MODE SELECT command.

    In this scenario, when trying to change cache_type, the write always fails:

    # echo "none" >cache_type
    bash: echo: write error: Invalid argument

    And the following appears in dmesg:

    [13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current]
    [13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list

    From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC
    PARAMETER field in the mode parameter header:
    ...
    The write protect (WP) bit for mode data sent with a MODE SELECT
    command shall be ignored by the device server.
    ...
    The DPOFUA bit is reserved for mode data sent with a MODE SELECT
    command.
    ...

    The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved
    and shall be set to zero.
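
    A minimal sketch of what the quoted text implies for the sender of
    MODE SELECT: clear the whole DEVICE-SPECIFIC PARAMETER byte obtained
    from MODE SENSE before sending the data back. The struct and function
    below are hypothetical and simplified; they are not the sd driver's
    types.

```c
#include <stdint.h>

#define SKETCH_WP     0x80 /* write protect bit */
#define SKETCH_DPOFUA 0x10 /* DPO and FUA support bit */

/* Hypothetical, simplified view of a parsed mode parameter header. */
struct sketch_mode_header {
    uint8_t medium_type;
    uint8_t device_specific; /* WP, DPOFUA and reserved bits */
    uint8_t block_descriptor_length;
};

/* Clear WP, DPOFUA and the reserved bits before reuse in MODE SELECT. */
void sketch_prepare_for_mode_select(struct sketch_mode_header *hdr)
{
    hdr->device_specific = 0;
}
```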

    [mkp: shuffled commentary to commit description]

    Cc: stable@vger.kernel.org
    Signed-off-by: Ivan Mironov
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Ivan Mironov
     
  • commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.

    The commit 356fd2663cff ("scsi: Set request queue runtime PM status back to
    active on resume") fixed up the inconsistent RPM status between request
    queue and device. However changing request queue RPM status shall be done
    only on successful resume, otherwise status may be still inconsistent as
    below,

    Request queue: RPM_ACTIVE
    Device: RPM_SUSPENDED

    This ends up in a soft lockup because requests can be submitted to the
    underlying devices while those devices and their required resources are
    not resumed.

    For example,

    After the above inconsistent status occurs, an IO request can be submitted
    to the UFS device driver while a required resource (like the clock) is not
    resumed yet, which leads to the warning in the call stack below,

    WARN_ON(hba->clk_gating.state != CLKS_ON);
    ufshcd_queuecommand
    scsi_dispatch_cmd
    scsi_request_fn
    __blk_run_queue
    cfq_insert_request
    __elv_add_request
    blk_flush_plug_list
    blk_finish_plug
    jbd2_journal_commit_transaction
    kjournald2

    We may see all subsequent IO requests hang because there is no response
    from the storage host or device, and then a soft lockup happens in the
    system. In the end, the system may crash in many ways.
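
    The rule "change the request queue status only on successful resume"
    can be sketched as below. The types and the function are hypothetical
    models for illustration, not the SCSI PM code.

```c
enum sketch_rpm { SKETCH_RPM_ACTIVE, SKETCH_RPM_SUSPENDED };

struct sketch_dev {
    enum sketch_rpm dev_status;   /* device runtime PM status */
    enum sketch_rpm queue_status; /* request queue runtime PM status */
};

/*
 * resume_err is the result of the device's resume callback
 * (0 on success, negative on failure). Both statuses move to
 * ACTIVE together, and only when the resume succeeded.
 */
int sketch_finish_resume(struct sketch_dev *d, int resume_err)
{
    if (resume_err == 0) {
        d->dev_status = SKETCH_RPM_ACTIVE;
        d->queue_status = SKETCH_RPM_ACTIVE;
    }
    return resume_err;
}
```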

    Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume)
    Cc: stable@vger.kernel.org
    Signed-off-by: Stanley Chu
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Stanley Chu
     

13 Jan, 2019

2 commits

  • commit 4e87eb2f46ea547d12a276b2e696ab934d16cfb6 upstream.

    Certain older adapters such as the OneConnect OCe10100 may not have a valid
    wqpcnt value. In this case, do not set queue->page_count to 0 in
    lpfc_sli4_queue_alloc() as this will prevent the driver from initializing.

    Fixes: 895427bd01 ("scsi: lpfc: NVME Initiator: Base modifications")
    Cc: stable@vger.kernel.org # 4.11+
    Signed-off-by: Ewan D. Milne
    Reviewed-by: Laurence Oberman
    Tested-by: Laurence Oberman
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Ewan D. Milne
     
  • [ Upstream commit 9ae4f8420ed7be4b13c96600e3568c144d101a23 ]

    If "interface" is NULL then we can't release it and trying to will only
    lead to an Oops.

    Fixes: aea71a024914 ("[SCSI] bnx2fc: Introduce interface structure for each vlan interface")
    Signed-off-by: Dan Carpenter
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Dan Carpenter
     

29 Dec, 2018

1 commit

  • commit 61cce6f6eeced5ddd9cac55e807fe28b4f18c1ba upstream.

    When boxes are run near (or to) OOM, we have a problem with the discard
    page allocation in sd. If we fail allocating the special page, we return
    busy, and it'll get retried. But since ordering is honored for dispatch
    requests, we can keep retrying this same IO and failing. Behind that IO
    could be requests that want to free memory, but they never get the
    chance. This means you get repeated spews of traces like this:

    [1201401.625972] Call Trace:
    [1201401.631748] dump_stack+0x4d/0x65
    [1201401.639445] warn_alloc+0xec/0x190
    [1201401.647335] __alloc_pages_slowpath+0xe84/0xf30
    [1201401.657722] ? get_page_from_freelist+0x11b/0xb10
    [1201401.668475] ? __alloc_pages_slowpath+0x2e/0xf30
    [1201401.679054] __alloc_pages_nodemask+0x1f9/0x210
    [1201401.689424] alloc_pages_current+0x8c/0x110
    [1201401.699025] sd_setup_write_same16_cmnd+0x51/0x150
    [1201401.709987] sd_init_command+0x49c/0xb70
    [1201401.719029] scsi_setup_cmnd+0x9c/0x160
    [1201401.727877] scsi_queue_rq+0x4d9/0x610
    [1201401.736535] blk_mq_dispatch_rq_list+0x19a/0x360
    [1201401.747113] blk_mq_sched_dispatch_requests+0xff/0x190
    [1201401.758844] __blk_mq_run_hw_queue+0x95/0xa0
    [1201401.768653] blk_mq_run_work_fn+0x2c/0x30
    [1201401.777886] process_one_work+0x14b/0x400
    [1201401.787119] worker_thread+0x4b/0x470
    [1201401.795586] kthread+0x110/0x150
    [1201401.803089] ? rescuer_thread+0x320/0x320
    [1201401.812322] ? kthread_park+0x90/0x90
    [1201401.820787] ? do_syscall_64+0x53/0x150
    [1201401.829635] ret_from_fork+0x29/0x40

    Ensure that the discard page allocation has a mempool backing, so we
    know we can make progress.
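
    A user-space sketch of the mempool idea, assuming a tiny hand-rolled
    pool rather than the kernel's mempool API: a few pages are reserved
    up front, allocation falls back to the reserve when the normal
    allocator fails, and freed pages refill the reserve first. All names
    and sizes here are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_RESERVE   2
#define SKETCH_PAGE_SIZE 4096

struct sketch_pool {
    void *reserve[SKETCH_RESERVE];
    size_t free_count; /* number of pages currently held in reserve */
};

/* Pre-allocate the reserve; returns 0 on success, -1 on failure. */
int sketch_pool_init(struct sketch_pool *p)
{
    p->free_count = 0;
    while (p->free_count < SKETCH_RESERVE) {
        void *page = malloc(SKETCH_PAGE_SIZE);
        if (!page) {
            while (p->free_count > 0)
                free(p->reserve[--p->free_count]);
            return -1;
        }
        p->reserve[p->free_count++] = page;
    }
    return 0;
}

/* Try the normal allocator first, then fall back to the reserve. */
void *sketch_pool_alloc(struct sketch_pool *p, int simulate_oom)
{
    void *page = simulate_oom ? NULL : malloc(SKETCH_PAGE_SIZE);

    if (!page && p->free_count > 0)
        page = p->reserve[--p->free_count];
    return page;
}

/* Refill the reserve before handing memory back to the allocator. */
void sketch_pool_free(struct sketch_pool *p, void *page)
{
    if (p->free_count < SKETCH_RESERVE)
        p->reserve[p->free_count++] = page;
    else
        free(page);
}
```

    Because every in-flight request eventually returns its page, a
    waiter on an empty reserve can always make progress once an earlier
    request completes.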

    Cc: stable@vger.kernel.org
    Signed-off-by: Jens Axboe
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Jens Axboe
     

21 Dec, 2018

2 commits

  • [ Upstream commit 02f425f811cefcc4d325d7a72272651e622dc97e ]

    Currently pvscsi_remove calls free_irq more than once as
    pvscsi_release_resources and __pvscsi_shutdown both call
    pvscsi_shutdown_intr. This results in a 'Trying to free already-free IRQ'
    warning and stack trace. To solve the problem pvscsi_shutdown_intr has been
    moved out of pvscsi_release_resources.

    Signed-off-by: Cathy Avery
    Reviewed-by: Ewan D. Milne
    Reviewed-by: Dan Carpenter
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Cathy Avery
     
  • [ Upstream commit 5db6dd14b31397e8cccaaddab2ff44ebec1acf25 ]

    This commit addresses NULL pointer dereference in iscsi_eh_session_reset.
    Reference should not be made to session->leadconn when session->state is
    set to ISCSI_STATE_TERMINATE.

    Signed-off-by: Fred Herard
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Lee Duncan
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Fred Herard
     

08 Dec, 2018

2 commits

  • commit 81df022b688d43d2a3667518b2f755d384397910 upstream.

    Cleanly fill memory for "vendor" and "model" with 0-bytes for the
    "compatible" case rather than adding only a single 0 byte. This
    simplifies the devinfo code a bit, and avoids mistakes in other places
    of the code (not in current upstream, but we had one such mistake in the
    SUSE kernel).
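
    A minimal sketch of "fill the whole field with 0-bytes" for a
    fixed-width field, using a hypothetical helper rather than the
    devinfo code itself:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into a fixed-width field and pad the rest with 0-bytes.
 * If src is at least dst_len bytes long, the field is filled
 * completely and is not NUL-terminated, as is usual for fixed-width
 * fields.
 */
void sketch_fill_field(char *dst, size_t dst_len, const char *src)
{
    size_t n = strlen(src);

    if (n > dst_len)
        n = dst_len;
    memset(dst, 0, dst_len);
    memcpy(dst, src, n);
}
```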

    [mkp: applied by hand and added braces]

    Signed-off-by: Martin Wilck
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Martin Wilck
     
  • commit 8c5a50e8e7ad812a62f7ccf28d9a5e74fddf3000 upstream.

    The bfa driver has a number of real issues with string termination
    that gcc-8 now points out:

    drivers/scsi/bfa/bfad_bsg.c: In function 'bfad_iocmd_port_get_attr':
    drivers/scsi/bfa/bfad_bsg.c:320:9: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_psymb_init':
    drivers/scsi/bfa/bfa_fcs.c:775:9: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:781:9: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:788:9: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:801:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:808:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_nsymb_init':
    drivers/scsi/bfa/bfa_fcs.c:837:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:844:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c:852:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_psymb_init':
    drivers/scsi/bfa/bfa_fcs.c:778:2: error: 'strncat' output may be truncated copying 10 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs.c:784:2: error: 'strncat' output may be truncated copying 30 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs.c:803:3: error: 'strncat' output may be truncated copying 44 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs.c:811:3: error: 'strncat' output may be truncated copying 16 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs.c: In function 'bfa_fcs_fabric_nsymb_init':
    drivers/scsi/bfa/bfa_fcs.c:840:2: error: 'strncat' output may be truncated copying 10 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs.c:847:2: error: 'strncat' output may be truncated copying 30 bytes from a string of length 63 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_fdmi_get_hbaattr':
    drivers/scsi/bfa/bfa_fcs_lport.c:2657:10: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs_lport.c:2659:11: error: argument to 'sizeof' in 'strncat' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]
    drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_lport_ms_gmal_response':
    drivers/scsi/bfa/bfa_fcs_lport.c:3232:5: error: 'strncpy' output may be truncated copying 16 bytes from a string of length 247 [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_lport_ns_send_rspn_id':
    drivers/scsi/bfa/bfa_fcs_lport.c:4670:3: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c:4682:3: error: 'strncat' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_lport_ns_util_send_rspn_id':
    drivers/scsi/bfa/bfa_fcs_lport.c:5206:3: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c:5215:3: error: 'strncat' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcs_lport.c: In function 'bfa_fcs_fdmi_get_portattr':
    drivers/scsi/bfa/bfa_fcs_lport.c:2751:2: error: 'strncpy' specified bound 128 equals destination size [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcbuild.c: In function 'fc_rspnid_build':
    drivers/scsi/bfa/bfa_fcbuild.c:1254:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
    drivers/scsi/bfa/bfa_fcbuild.c:1253:25: note: length computed here
    drivers/scsi/bfa/bfa_fcbuild.c: In function 'fc_rsnn_nn_build':
    drivers/scsi/bfa/bfa_fcbuild.c:1275:2: error: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]

    In most cases, this can be addressed by correctly calling strlcpy and
    strlcat instead of strncpy/strncat, with the size of the destination
    buffer as the last argument.

    For consistency, I'm changing the other callers of strncpy() in this
    driver the same way.
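
    To illustrate why the destination size is the right bound, here is a
    small local helper modeled on strlcpy semantics. It is a sketch for
    user space with a hypothetical name, not the kernel's or any libc's
    implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, which is size bytes long. The result is always
 * NUL-terminated when size > 0. Returns strlen(src), so a return
 * value >= size tells the caller that truncation happened.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

    A caller passes sizeof(destination), never the size or length of the
    source, which is the mistake the warnings above point at.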

    Signed-off-by: Arnd Bergmann
    Reviewed-by: Johannes Thumshirn
    Acked-by: Sudarsana Kalluru
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     

21 Nov, 2018

7 commits

  • commit 8dc765d438f1e42b3e8227b3b09fad7d73f4ec9a upstream.

    c2856ae2f315d ("blk-mq: quiesce queue before freeing queue") has
    already fixed this race. However, the implied synchronize_rcu()
    in blk_mq_quiesce_queue() can slow down LUN probe a lot, which caused
    a performance regression.

    Then 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
    tried to avoid the unnecessary synchronize_rcu() by quiescing the queue
    only when queue initialization is done, because it is usual to see
    lots of nonexistent LUNs which need to be probed.

    However, it turns out that it isn't safe to quiesce the queue only when
    queue initialization is done. When one SCSI command is completed,
    the sender of the command can be woken up immediately and the SCSI
    device may then be removed while the run queue in scsi_end_request()
    is still in progress, so a kernel panic can result.

    In Red Hat QE lab, there are several reports about this kind of kernel
    panic triggered during kernel booting.

    This patch tries to address the issue by grabbing one queue usage
    counter while freeing a request and during the queue run that follows.

    Fixes: 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
    Cc: Andrew Jones
    Cc: Bart Van Assche
    Cc: linux-scsi@vger.kernel.org
    Cc: Martin K. Petersen
    Cc: Christoph Hellwig
    Cc: James E.J. Bottomley
    Cc: stable
    Cc: jianchao.wang
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • commit f635e48e866ee1a47d2d42ce012fdcc07bf55853 upstream.

    This patch initializes port speed so that firmware does not set a lower
    operating speed. Setting a lower speed in firmware impacts WRITE performance.

    Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
    Cc:
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Tested-by: Laurence Oberman
    Reviewed-by: Ewan D. Milne
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 5c6400536481d9ef44ef94e7bf2c7b8e81534db7 upstream.

    This patch fixes an issue where the driver clears the NPort ID map instead
    of marking the handle in use. Once the driver clears the NPort ID from the
    database, it can reuse the same NPort ID, resulting in a PLOGI failure.

    [mkp: fixed Himanshu's SoB]

    Fixes: a084fd68e1d2 ("scsi: qla2xxx: Fix re-login for Nport Handle in use")
    Cc:
    Signed-off-by: Quinn Tran
    Reviewed-by: Ewan D. Milne
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 1e4ac5d6fe0a4af17e4b6251b884485832bf75a3 upstream.

    If the chip is unable to fully initialize, use the full shutdown sequence
    to clear out any stale FW state.

    Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
    Cc: stable@vger.kernel.org #4.10
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 7c388f91ec1a59b0ed815b07b90536e2d57e1e1f upstream.

    Remove stale debug trace.

    Fixes: 1eb42f965ced ("qla2xxx: Make trace flags more readable")
    Cc: stable@vger.kernel.org #4.10
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit b86ac8fd4b2f6ec2f9ca9194c56eac12d620096f upstream.

    This patch improves performance for 16G and above adapters by removing
    an additional call to process_response_queue().

    [mkp: typo]

    Cc:
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 4c1458df9635c7e3ced155f594d2e7dfd7254e21 upstream.

    Fixes: 6246b8a1d26c7c ("[SCSI] qla2xxx: Enhancements to support ISP83xx.")
    Fixes: 1bb395485160d2 ("qla2xxx: Correct iiDMA-update calling conventions.")
    Cc:
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Himanshu Madhani
     

14 Nov, 2018

4 commits

  • [ Upstream commit ca7fb76e091f889cfda1287c07a9358f73832b39 ]

    On io completion, the driver is taking an adapter wide lock and nulling the
    scsi command back pointer. The nulling of the back pointer is to signify the
    io was completed and the scsi_done() routine was called. However, the routine
    makes no check to see if the abort routine had done the same thing and
    possibly nulled the pointer. Thus it may doubly-complete the io.

    Make the following mods:

    - Check to make sure forward progress (call scsi_done()) only happens if the
    command pointer was non-null.

    - As the taking of the lock, which is adapter wide, is very costly on a system
    under load, null the pointer using an xchg operation rather than under lock.
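
    The two changes above can be sketched with a C11 atomic exchange in
    user space. The struct and function names are hypothetical; the
    driver itself would use the kernel's xchg().

```c
#include <stdatomic.h>
#include <stddef.h>

struct sketch_io {
    _Atomic(void *) cmd; /* back pointer to the SCSI command */
};

/*
 * Atomically take the back pointer and leave NULL behind. Exactly one
 * caller (completion path or abort path) gets the non-NULL pointer and
 * may go on to complete the command; the other sees NULL and must not.
 */
void *sketch_claim_cmd(struct sketch_io *io)
{
    return atomic_exchange(&io->cmd, NULL);
}
```

    The completion path would call scsi_done() only when the returned
    pointer is non-NULL, so no adapter-wide lock is needed for this step.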

    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Smart
     
  • [ Upstream commit 0ef01a2d95fd62bb4f536e7ce4d5e8e74b97a244 ]

    When running an mds diagnostic that passes frames with the switch, soft
    lockups are detected. The driver is in a CQE processing loop and has
    sufficient amount of traffic that it never exits the ring processing routine,
    thus the "lockup".

    Cap the number of elements in the work processing routine to 64 elements. This
    ensures that the cpu will be given up and the handler rescheduled to process
    additional items.
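
    A minimal sketch of a capped processing pass. The function and the
    counter standing in for the ring are hypothetical; only the cap of
    64 comes from the text above.

```c
#define SKETCH_MAX_PER_PASS 64

/*
 * Handle at most SKETCH_MAX_PER_PASS entries, then return so the
 * caller can yield the CPU. If *pending is still non-zero afterwards,
 * the caller reschedules the handler for another pass.
 */
unsigned int sketch_process_ring(unsigned int *pending)
{
    unsigned int handled = 0;

    while (*pending > 0 && handled < SKETCH_MAX_PER_PASS) {
        (*pending)--; /* stand-in for handling one CQE */
        handled++;
    }
    return handled;
}
```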

    Signed-off-by: Dick Kennedy
    Signed-off-by: James Smart
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Smart
     
  • [ Upstream commit 47db7873136a9c57c45390a53b57019cf73c8259 ]

    In megasas_mgmt_compat_ioctl_fw(), to handle the structure
    compat_megasas_iocpacket 'cioc', a user-space structure megasas_iocpacket
    'ioc' is allocated before megasas_mgmt_ioctl_fw() is invoked to handle
    the packet. Since the two data structures have different fields, the data
    is copied from 'cioc' to 'ioc' field by field. In the copy process,
    'sense_ptr' is prepared if the field 'sense_len' is not null, because it
    will be used in megasas_mgmt_ioctl_fw(). To prepare 'sense_ptr', the
    user-space data 'ioc->sense_off' and 'cioc->sense_off' are copied and
    saved to kernel-space variables 'local_sense_off' and 'user_sense_off'
    respectively. Given that 'ioc->sense_off' is also copied from
    'cioc->sense_off', 'local_sense_off' and 'user_sense_off' should have the
    same value. However, 'cioc' is in the user space and a malicious user can
    race to change the value of 'cioc->sense_off' after it is copied to
    'ioc->sense_off' but before it is copied to 'user_sense_off'. By doing
    so, the attacker can inject different values into 'local_sense_off' and
    'user_sense_off'. This can cause undefined behavior in the following
    execution, because the two variables are supposed to be same.

    This patch enforces a check on the two kernel variables 'local_sense_off'
    and 'user_sense_off' to make sure they are the same after the copy. In
    case they are not, an error code EINVAL will be returned.
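
    The check itself is small. A sketch, with a hypothetical function
    name and the two kernel-space copies passed in as plain values:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Both arguments are kernel-space copies of what should be the same
 * user-space value, fetched at two different times. If user space
 * changed the value between the two fetches, reject the request.
 */
int sketch_check_sense_off(uint32_t local_sense_off, uint32_t user_sense_off)
{
    if (local_sense_off != user_sense_off)
        return -EINVAL;
    return 0;
}
```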

    Signed-off-by: Wenwen Wang
    Acked-by: Sumit Saxena
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Wenwen Wang
     
  • [ Upstream commit fd47d919d0c336e7c22862b51ee94927ffea227a ]

    If a target disconnects during a PIO data transfer the command may fail
    when the target reconnects:

    scsi host1: DMA length is zero!
    scsi host1: cur adr[04380000] len[00000000]

    The scsi bus is then reset. This happens because the residual reached
    zero before the transfer was completed.

    The usual residual calculation relies on the Transfer Count registers.
    That works for DMA transfers but not for PIO transfers. Fix the problem
    by storing the PIO transfer residual and using that to correctly
    calculate bytes_sent.

    Fixes: 6fe07aaffbf0 ("[SCSI] m68k: new mac_esp scsi driver")
    Tested-by: Stan Johnson
    Signed-off-by: Finn Thain
    Tested-by: Michael Schmitz
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Finn Thain
     

04 Nov, 2018

4 commits

  • [ Upstream commit 597d74005ba85e87c256cd732128ebf7faf54247 ]

    The USB storage glue sets the try_rc_10_first flag in an attempt to
    avoid wedging poorly implemented legacy USB devices.

    If the device capacity is too large to be expressed in the provided
    response buffer field of READ CAPACITY(10), a well-behaved device will
    set the reported capacity to 0xFFFFFFFF. We will then attempt to issue a
    READ CAPACITY(16) to obtain the real capacity.

    Since this part of the discovery logic is not covered by the first_scan
    flag, a warning will be printed a couple of times per revalidate
    attempt if we upgrade from READ CAPACITY(10) to READ CAPACITY(16).

    Remember that we have successfully issued READ CAPACITY(16) so we can
    take the fast path on subsequent revalidate attempts.

    Reported-by: Menion
    Reviewed-by: Laurence Oberman
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Martin K. Petersen
     
  • [ Upstream commit 09dd15e0d9547ca424de4043bcd429bab6f285c8 ]

    Following an RSCN, ibmvfc will issue an ADISC to determine if the
    underlying target has changed, comparing the SCSI ID, WWPN, and WWNN to
    determine how to handle the rport in discovery. However, the comparison
    of the WWPN and WWNN was performing a memcmp between a big endian field
    against a CPU endian field, which resulted in the wrong answer on LE
    systems. This was observed as unexpected errors getting logged at boot
    time as targets were getting relogins when not needed.

    Signed-off-by: Brian King
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Brian King
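
    The endianness problem described above can be sketched in plain C. This
    is an illustration only, not the ibmvfc patch: put_be64(), get_be64() and
    both wwpn_matches*() helpers are hypothetical names. It assumes the
    WWPN arrives as 8 bytes in big-endian (wire) order and is compared
    against a value held in CPU byte order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 64-bit value in big-endian (wire) order. */
static void put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Hypothetical helper: decode a big-endian field into a CPU-endian value. */
static uint64_t get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/*
 * Problematic pattern: byte-wise compare of a wire-order field against a
 * CPU-order value. The result depends on host endianness; equal IDs only
 * compare equal on a big-endian host.
 */
static bool wwpn_matches_memcmp(const uint8_t wire[8], uint64_t cpu_wwpn)
{
    return memcmp(wire, &cpu_wwpn, sizeof(cpu_wwpn)) == 0;
}

/* Endian-safe pattern: convert the wire field first, then compare values. */
static bool wwpn_matches(const uint8_t wire[8], uint64_t cpu_wwpn)
{
    return get_be64(wire) == cpu_wwpn;
}
```

    On a little-endian host wwpn_matches_memcmp() reports a mismatch for
    identical IDs (unless the byte pattern happens to be a palindrome),
    which is consistent with the unneeded relogins described in the
    commit message.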
     
  • [ Upstream commit 3a9910d7b686546dcc9986e790af17e148f1c888 ]

    qla2x00_tmf_sp_done() now deletes the timer that will run
    qla2x00_tmf_iocb_timeout(), but doesn't check whether the timer already
    expired. Check the return value from del_timer() to avoid calling
    complete() a second time.

    Fixes: 4440e46d5db7 ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous ...")
    Fixes: 1514839b3664 ("scsi: qla2xxx: Fix NULL pointer crash due to active ...")
    Signed-off-by: Ben Hutchings
    Acked-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Ben Hutchings
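
    The race described above can be modelled in userspace. The sketch below
    is not the qla2xxx code: struct fake_timer, fake_del_timer() and
    struct tmf are hypothetical stand-ins. It relies only on the documented
    behaviour that del_timer() returns nonzero when it deactivated a
    pending timer and 0 when the timer was not pending (for example because
    it had already expired).

```c
/* Hypothetical stand-in for a kernel timer: only tracks "pending". */
struct fake_timer {
    int pending;
};

/* Mimics del_timer(): returns 1 if a pending timer was deactivated. */
static int fake_del_timer(struct fake_timer *t)
{
    int was_pending = t->pending;

    t->pending = 0;
    return was_pending;
}

/* Hypothetical request state; completions counts complete() calls. */
struct tmf {
    struct fake_timer timer;
    int completions;
};

/* Timeout path: the timer expired, so it completes the request itself. */
static void tmf_timeout(struct tmf *t)
{
    t->timer.pending = 0;
    t->completions++;
}

/* Done path: complete only if the timer had not already fired. */
static void tmf_done(struct tmf *t)
{
    if (!fake_del_timer(&t->timer))
        return;             /* timeout handler already completed it */
    t->completions++;
}
```

    Whichever path runs first, completions ends at 1. Without the return
    value check, the done path would complete a second time after a
    timeout.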
     
  • [ Upstream commit d18539754d97876503275efc7d00a1901bb0cfad ]

    As reported by Meelis Roos, my previous patch causes an incorrect
    calculation of the timeout, through an undefined signed integer
    overflow:

    [ 12.228155] UBSAN: Undefined behaviour in drivers/scsi/aacraid/commsup.c:2514:49
    [ 12.228229] signed integer overflow:
    [ 12.228283] 964297611 * 250 cannot be represented in type 'long int'

    The problem is that doing a multiplication with HZ first and then
    dividing by USEC_PER_SEC worked correctly for 32-bit microseconds,
    but not for 32-bit nanoseconds, which would require up to 41 bits.

    This reworks the calculation to first convert the nanoseconds into
    jiffies, which should give us the same result as before and not overflow.

    Unfortunately I did not understand the exact intention of the algorithm,
    in particular the part where we add half a second, so it's possible that
    there is still a preexisting problem in this function. I added a comment
    that this would be handled more nicely using usleep_range(), which
    generally works better for waking up at a particular time than the
    current schedule_timeout() based implementation. I did not feel
    comfortable trying to implement that without being sure what the
    intent is here though.

    Fixes: 820f18865912 ("scsi: aacraid: use timespec64 instead of timeval")
    Tested-by: Meelis Roos
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Arnd Bergmann
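
    The arithmetic behind the UBSAN report can be checked with a small
    sketch. It is not the aacraid code: the function names are hypothetical,
    HZ is assumed to be 250 (matching the "* 250" in the report), and
    int32_t stands in for a 32-bit 'long int'.

```c
#include <stdint.h>

#define HZ 250                     /* assumption: taken from the report */
#define NSEC_PER_SEC 1000000000L

/* Would "value * HZ" exceed the range of a signed 32-bit integer? */
static int mul_first_overflows_32(int32_t value)
{
    return (int64_t)value * HZ > INT32_MAX;
}

/* Multiply-first form, widened to 64 bits so the product is well defined. */
static int64_t nsec_to_jiffies_wide(int32_t nsec)
{
    return (int64_t)nsec * HZ / NSEC_PER_SEC;
}

/*
 * Divide-first form: NSEC_PER_SEC / HZ is exact (4,000,000 ns per jiffy
 * at HZ = 250), so no intermediate value exceeds the input.
 */
static int32_t nsec_to_jiffies_div_first(int32_t nsec)
{
    return (int32_t)(nsec / (NSEC_PER_SEC / HZ));
}
```

    For the reported value, 964297611 * 250 = 241074402750, which does not
    fit in 32 bits, while 964297611 / 4000000 = 241 jiffies stays in range.
    A microsecond value below 1000000 multiplied by 250 stays below 2^31,
    which is why the older microsecond-based code did not overflow.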
     

20 Oct, 2018

4 commits

  • [ Upstream commit f1f1fadacaf08b7cf11714c0c29f8fa4d4ef68a9 ]

    When sd_init_command() gets a command with an unknown req_op(), it crashes
    the system via BUG().

    This makes debugging the actual reason for the broken request cmd_flags pretty
    hard as the system is down before it's able to write out debugging data on the
    serial console or the trace buffer.

    Change the BUG() to a WARN_ON() and return BLKPREP_KILL to fail gracefully and
    return an I/O error to the producer of the request.

    Signed-off-by: Johannes Thumshirn
    Cc: Hannes Reinecke
    Cc: Bart Van Assche
    Cc: Christoph Hellwig
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
     
  • [ Upstream commit 318ddb34b2052f838aa243d07173e2badf3e630e ]

    While DLPAR-adding the primary ipr adapter back, the driver goes through
    adapter initialization and then schedules ipr_worker_thread to start the
    disk scan by dropping the host lock and calling scsi_add_device. If an
    adapter reset request then arrives, the driver calls scsi_block_requests,
    which causes scsi_add_device to hang until requests are unblocked. But
    ipr_worker_thread cannot run to do the unblock because it is stuck in
    scsi_add_device.

    This patch fixes the issue.

    [mkp: typo and whitespace fixes]

    Signed-off-by: Wen Xiong
    Acked-by: Brian King
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Wen Xiong
     
  • [ Upstream commit adad633af7b970bfa5dd1b624a4afc83cac9b235 ]

    While reviewing another part of the code, Kees noticed that the strncpy of the
    partition name might not always be NUL terminated. Switch to using strscpy
    which does this safely.

    Reported-by: Kees Cook
    Signed-off-by: Laura Abbott
    Reviewed-by: Kees Cook
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
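
    The termination problem can be shown in userspace. strscpy() is a
    kernel-only function, so the sketch below uses a hypothetical helper,
    bounded_copy(), that offers the same guarantee (the destination is
    always NUL terminated). It returns -1 on truncation, whereas the kernel
    function returns -E2BIG. last_byte_after_strncpy() is also a
    hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * strncpy() writes no terminator when the source has size or more
 * characters. Returns the last byte of dst after the copy for inspection.
 */
static char last_byte_after_strncpy(char *dst, size_t size, const char *src)
{
    strncpy(dst, src, size);
    return dst[size - 1];
}

/*
 * Hypothetical stand-in for an always-terminating bounded copy.
 * Returns the number of characters copied, or -1 if src was truncated.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -1;
    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';          /* terminator is written unconditionally */
    return src[i] == '\0' ? (long)i : -1;
}
```

    With a 4-byte destination and the source "abcdef", strncpy() leaves
    'd' in the last byte and no terminator, while bounded_copy() stores
    "abc" and reports the truncation.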
     
  • [ Upstream commit d792d4c4fc866ae224b0b0ca2aabd87d23b4d6cc ]

    There's currently a warning about string overflow with strncat:

    drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c: In function 'ibmvscsis_probe':
    drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c:3479:2: error: 'strncat' specified
    bound 64 equals destination size [-Werror=stringop-overflow=]
    strncat(vscsi->eye, vdev->name, MAX_EYE);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Switch to a single snprintf instead of a strcpy + strcat to handle this
    cleanly.

    Signed-off-by: Laura Abbott
    Suggested-by: Kees Cook
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
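
    A minimal sketch of the single-snprintf approach follows. It is not the
    ibmvscsi_tgt patch: build_eye() and its prefix argument are
    hypothetical, and MAX_EYE is set to 64 only because the warning above
    says the bound 64 equals the destination size.

```c
#include <stdio.h>

#define MAX_EYE 64   /* assumption: taken from the compiler warning */

/*
 * Build "<prefix><name>" with one bounded call. snprintf() always
 * terminates the buffer and returns the length it would have written,
 * so truncation is detectable. Returns 1 if the string fit, 0 otherwise.
 */
static int build_eye(char eye[MAX_EYE], const char *prefix, const char *name)
{
    int n = snprintf(eye, MAX_EYE, "%s%s", prefix, name);

    return n >= 0 && n < MAX_EYE;
}
```

    Unlike strncat(dst, src, sizeof(dst)), whose bound counts only the
    appended characters and ignores what dst already holds, the snprintf()
    bound covers the whole destination including the terminator.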
     

18 Oct, 2018

1 commit

  • [ Upstream commit cbe3fd39d223f14b1c60c80fe9347a3dd08c2edb ]

    We should first do the le16_to_cpu endian conversion and then apply the
    FCP_CMD_LENGTH_MASK mask.

    Fixes: 5f35509db179 ("qla2xxx: Terminate exchange if corrupted")
    Signed-off-by: Dan Carpenter
    Acked-by: Quinn Tran
    Acked-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
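
    Why the order matters can be shown without a big-endian machine. The
    sketch below is not the qla2xxx code: all function names are
    hypothetical, LEN_MASK is an assumed value rather than the driver's
    FCP_CMD_LENGTH_MASK definition, and raw_load_on_be() simulates what a
    big-endian CPU sees when it loads a little-endian 16-bit field without
    conversion.

```c
#include <stdint.h>

#define LEN_MASK 0x0fffu   /* assumed mask value, for illustration only */

/* Byte swap, i.e. what le16_to_cpu() has to do on a big-endian CPU. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Simulated unconverted load of a little-endian field on a BE CPU. */
static uint16_t raw_load_on_be(const uint8_t b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Wrong order: the mask is applied to the still-swapped bytes. */
static uint16_t len_mask_first(const uint8_t b[2])
{
    return swap16(raw_load_on_be(b) & LEN_MASK);
}

/* Right order: convert to CPU order, then mask. */
static uint16_t len_convert_first(const uint8_t b[2])
{
    return swap16(raw_load_on_be(b)) & LEN_MASK;
}
```

    For the field bytes 0x34 0x12 (little-endian 0x1234), converting first
    yields 0x0234, while masking first yields 0x1204 because the mask
    clears bits of the wrong byte. On a little-endian CPU the conversion is
    a no-op, so both orders give the same result and the bug stays hidden.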
     

10 Oct, 2018

2 commits

  • [ Upstream commit c77a2fa3ff8f73d1a485e67e6f81c64823739d59 ]

    The QED driver commit, 1ac4329a1cff ("qed: Add configuration information
    to register dump and debug data"), removes the CRC length validation
    causing nvm_get_image failure while loading qedi driver:

    [qed_mcp_get_nvm_image:2700(host_10-0)]Image [0] is too big - 00006008 bytes
    where only 00006004 are available
    [qedi_get_boot_info:2253]:10: Could not get NVM image. ret = -12

    Hence add and adjust the CRC size to iSCSI NVM image to read boot info at
    qedi load time.

    Signed-off-by: Nilesh Javali
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Nilesh Javali
     
  • [ Upstream commit 89809b028b6f54187b7d81a0c69b35d394c52e62 ]

    Reported-by: Colin Ian King
    Signed-off-by: Varun Prakash
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Varun Prakash
     

04 Oct, 2018

3 commits

  • [ Upstream commit c3b10a55abc943a526aaecd7e860b15671beb906 ]

    The firmware on the controller may have been upgraded before the system
    was suspended. During resume, the driver needs to read the updated
    controller properties.

    Signed-off-by: Shivasharan S
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shivasharan S
     
  • [ Upstream commit aa154ea885eb0c2407457ce9c1538d78c95456fa ]

    When ioremap_nocache fails, the lack of error-handling code may cause
    unexpected results.

    This patch adds error-handling code after calling ioremap_nocache.

    Signed-off-by: Zhouyang Jia
    Reviewed-by: Johannes Thumshirn
    Acked-by: Manish Rangankar
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Zhouyang Jia
     
  • [ Upstream commit 1262dc09dc9ae7bf4ad00b6a2c5ed6a6936bcd10 ]

    Currently an open firmware property is copied into the partition_name
    variable without leaving room for the terminating \0.

    Later on, this variable (partition_name), which is 97 bytes long, is
    strncpyed into ibmvscsi_host_data->madapter_info->partition_name, which is
    96 bytes long, possibly truncating it 'again' and removing the \0.

    This patch simply decreases the partition name to 96 and copies it using
    strlcpy(), which guarantees that the string is \0 terminated. I think there
    is no issue if there is a truncation in this very first copy, i.e.,
    when the open firmware property is read and copied into the driver for the
    very first time.

    This issue also causes the following warning on GCC 8:

    drivers/scsi/ibmvscsi/ibmvscsi.c:281:2: warning: strncpy output may be truncated copying 96 bytes from a string of length 96 [-Wstringop-truncation]
    ...
    inlined from ibmvscsi_probe at drivers/scsi/ibmvscsi/ibmvscsi.c:2221:7:
    drivers/scsi/ibmvscsi/ibmvscsi.c:265:3: warning: strncpy specified bound 97 equals destination size [-Wstringop-truncation]

    CC: Bart Van Assche
    CC: Tyrel Datwyler
    Signed-off-by: Breno Leitao
    Acked-by: Tyrel Datwyler
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Breno Leitao
     

26 Sep, 2018

1 commit

  • [ Upstream commit fa519f701d27198a2858bb108fc18ea9d8c106a7 ]

    fc_rport_login() will be calling mutex_lock() while running inside an
    RCU-protected section, triggering the warning 'sleeping function called
    from invalid context'. To fix this we can drop the rcu functions here
    altogether as the disc mutex protecting the list itself is already held,
    preventing any list manipulation.

    Fixes: a407c593398c ("scsi: libfc: Fixup disc_mutex handling")
    Signed-off-by: Hannes Reinecke
    Acked-by: Johannes Thumshirn
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Hannes Reinecke
     

20 Sep, 2018

1 commit

  • [ Upstream commit 4dc98c1995482262e70e83ef029135247fafe0f2 ]

    tw_probe() returns 0 and releases resources if
    tw_initialize_device_extension(), pci_resource_start() or
    tw_reset_sequence() fails. twl_probe() returns 0 if
    twl_initialize_device_extension(), pci_iomap() or twl_reset_sequence()
    fails. twa_probe() returns 0 if tw_initialize_device_extension(),
    ioremap() or twa_reset_sequence() fails.

    The patch adds retval initialization for these cases.

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Anton Vasilyev
    Acked-by: Adam Radford
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anton Vasilyev
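
    The class of bug described above can be sketched as follows. This is
    not the 3ware driver code: probe_buggy() and probe_fixed() are
    hypothetical, the int arguments stand in for whether each setup step
    succeeds, and the chosen error codes are assumptions.

```c
#include <errno.h>

/* Buggy pattern: a later failure jumps to cleanup with retval still 0. */
static int probe_buggy(int enable_ok, int init_ok)
{
    int retval;

    retval = enable_ok ? 0 : -ENODEV;  /* earlier step sets retval */
    if (retval)
        return retval;

    if (!init_ok)
        goto out_release;              /* bug: retval was never updated */

    return 0;

out_release:
    /* resources would be released here */
    return retval;                     /* reports success to the caller */
}

/* Fixed pattern: set retval before every jump to the cleanup label. */
static int probe_fixed(int enable_ok, int init_ok)
{
    int retval;

    retval = enable_ok ? 0 : -ENODEV;
    if (retval)
        return retval;

    if (!init_ok) {
        retval = -ENOMEM;              /* assumed error code */
        goto out_release;
    }

    return 0;

out_release:
    /* resources would be released here */
    return retval;
}
```

    A probe function that returns 0 after releasing its resources leaves
    the caller believing the device is bound, so later callbacks may touch
    freed state.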