18 Apr, 2019
1 commit
-
errata:
When a read command returns less data than specified in the PRDs (for
example, there are two PRDs for this command, but the device returns a
number of bytes which is less than in the first PRD), the second PRD of
this command is not read out of the PRD FIFO, causing the next command
to use this PRD erroneously.
workaround:
- force sg_tablesize = 1
- modify the sg_io function in block/scsi_ioctl.c to use a 64k buffer
allocated with dma_alloc_coherent during the probe in ahci_imx
- to fix the scsi/sata hang seen when CD_ROM and HDD are accessed
simultaneously after the workaround is applied, do not go to sleep in
scsi_eh_handler when the host has failed commands
Signed-off-by: Richard Zhu
[Arul: Fix merge conflicts]
Signed-off-by: Arulpandiyan Vadivel
06 Apr, 2019
5 commits
-
[ Upstream commit 8beb90aaf334a6efa3e924339926b5f93a234dbb ]
commit 1917d42d14b7 ("fcoe: use enum for fip_mode") introduces a separate
enum for the fip_mode that shall be used during initialisation handling
until it is passed to fcoe_ctlr_link_up to set the initial fip_state. That
change was incomplete and gcc quietly converted in various places between
the fip_mode and the fip_state enum values with implicit enum conversions,
which fortunately cannot cause any issues in the actual code's execution.
clang however warns about these implicit enum conversions in the scsi
drivers. This commit consolidates the use of the two enums, guided by
clang's enum-conversion warnings.
This commit now completes the use of the fip_mode: It expects and uses
fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls
fcoe_ctlr_set_state() with the correct values in fcoe_ctlr_link_up(). It
also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to
indicate these two enums are distinct.
Link: https://github.com/ClangBuiltLinux/linux/issues/151
Fixes: 1917d42d14b7 ("fcoe: use enum for fip_mode")
Reported-by: Dmitry Golovin
Original-by: Lukas Bulwahn
CC: Lukas Bulwahn
CC: Nick Desaulniers
CC: Nathan Chancellor
Reviewed-by: Nathan Chancellor
Tested-by: Nathan Chancellor
Suggested-by: Johannes Thumshirn
Signed-off-by: Sedat Dilek
Signed-off-by: Hannes Reinecke
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit bcf3b67d16a4c8ffae0aa79de5853435e683945c ]
When creating the DMA pool for cmd frames fails, we should return -ENOMEM
instead of 0.
In some cases, in:
megasas_init_adapter_fusion()
    -->megasas_alloc_cmds()
       -->megasas_create_frame_pool
          creating the DMA pool fails,
       -->megasas_free_cmds() [1]
    -->megasas_alloc_cmds_fusion()
       fails, then goto fail_alloc_cmds.
    -->megasas_free_cmds() [2]
we will call megasas_free_cmds twice: [1] will kfree cmd_list and
[2] will use cmd_list. This causes a problem:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd = ffffffc000f70000
[00000000] *pgd=0000001fbf893003, *pud=0000001fbf893003,
*pmd=0000001fbf894003, *pte=006000006d000707
Internal error: Oops: 96000005 [#1] SMP
Modules linked in:
CPU: 18 PID: 1 Comm: swapper/0 Not tainted
task: ffffffdfb9290000 ti: ffffffdfb923c000 task.ti: ffffffdfb923c000
PC is at megasas_free_cmds+0x30/0x70
LR is at megasas_free_cmds+0x24/0x70
...
Call trace:
[] megasas_free_cmds+0x30/0x70
[] megasas_init_adapter_fusion+0x2f4/0x4d8
[] megasas_init_fw+0x2dc/0x760
[] megasas_probe_one+0x3c0/0xcd8
[] local_pci_probe+0x4c/0xb4
[] pci_device_probe+0x11c/0x14c
[] driver_probe_device+0x1ec/0x430
[] __driver_attach+0xa8/0xb0
[] bus_for_each_dev+0x74/0xc8
[] driver_attach+0x28/0x34
[] bus_add_driver+0x16c/0x248
[] driver_register+0x6c/0x138
[] __pci_register_driver+0x5c/0x6c
[] megasas_init+0xc0/0x1a8
[] do_one_initcall+0xe8/0x1ec
[] kernel_init_freeable+0x1c8/0x284
[] kernel_init+0x1c/0xe4
Signed-off-by: Jason Yan
Acked-by: Sumit Saxena
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 1749ef00f7312679f76d5e9104c5d1e22a829038 ]
We had a test report where, under memory pressure, adding LUNs to the
system would fail (the tests add LUNs strictly in sequence):
[ 5525.853432] scsi 0:0:1:1088045124: Direct-Access IBM 2107900 .148 PQ: 0 ANSI: 5
[ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS
[ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43
[ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0
[ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection
[ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off
[ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08
[ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5525.857838] sdk: sdk1
[ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk
[ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds
[ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
[ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
Looking at the code of scsi_alloc_sdev(), and all the calling contexts,
there seems to be no reason to use GFP_ATOMIC here. All the different
call-contexts use a mutex at some point, and nothing in between
forbids sleeping, as far as I could see. Additionally, the code that
later allocates the block queue for the device (scsi_mq_alloc_queue())
already uses GFP_KERNEL.
There are similar allocations in two other functions,
scsi_probe_and_add_lun() and scsi_add_lun(), that can also be done with
GFP_KERNEL.
Here are the contexts for the three functions so far:
scsi_alloc_sdev()
    scsi_probe_and_add_lun()
        scsi_sequential_lun_scan()
            __scsi_scan_target()
                scsi_scan_target()
                    mutex_lock()
                scsi_scan_channel()
                    scsi_scan_host_selected()
                        mutex_lock()
        scsi_report_lun_scan()
            __scsi_scan_target()
                ...
        __scsi_add_device()
            mutex_lock()
        __scsi_scan_target()
            ...
    scsi_report_lun_scan()
        ...
    scsi_get_host_dev()
        mutex_lock()
scsi_probe_and_add_lun()
    ...
scsi_add_lun()
    scsi_probe_and_add_lun()
        ...
So replace all these, and give them a bit of a better chance to succeed,
with more chances of reclaim.
Signed-off-by: Benjamin Block
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 4790595723d4b833b18c994973d39f9efb842887 ]
For internal IO and SMP IO, there is a time-out timer. In the
timer handler, it checks whether the IO is done according to the flags in
task->task_state_flags.
There is an issue which may cause the system to hang: an internal IO or SMP IO
is sent, but because of a hardware exception (such as an injected
2-bit ECC error) the IO is not completed and has not yet timed out. At that
point, a SAS controller reset occurs to recover the system. It releases
the resource and sets the status of the IO to SAS_TASK_STATE_DONE, so when the
IO times out, the handler never signals the completion of the IO and the
waiter waits forever.
[ 729.123632] Call trace:
[ 729.126791] [] __switch_to+0x94/0xa8
[ 729.133106] [] __schedule+0x1e8/0x7fc
[ 729.138975] [] schedule+0x34/0x8c
[ 729.144401] [] schedule_timeout+0x1d8/0x3cc
[ 729.150690] [] wait_for_common+0xdc/0x1a0
[ 729.157101] [] wait_for_completion+0x28/0x34
[ 729.165973] [] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[ 729.176447] [] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[ 729.185258] [] sas_eh_handle_sas_errors+0x1c8/0x7b8
[ 729.192391] [] sas_scsi_recover_host+0x130/0x398
[ 729.199237] [] scsi_error_handler+0x148/0x5c0
[ 729.206009] [] kthread+0x10c/0x138
[ 729.211563] [] ret_from_fork+0x10/0x18
To solve the issue, the callback function task_done of those IOs needs to be
called on SAS controller reset.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit efdcad62e7b8a02fcccc5ccca57806dce1482ac8 ]
When the PHY comes down, we currently do not set the negotiated linkrate:
root@(none)$ pwd
/sys/class/sas_phy/phy-0:0
root@(none)$ more enable
1
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$ echo 0 > enable
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$
This patch fixes the driver code to set it properly when the PHY comes
down.
If the PHY had been enabled, then set unknown; otherwise, flag as disabled.
The logical place to set the negotiated linkrate for this scenario is the PHY
down routine, which is called from the PHY down ISR.
However, it is not possible to know if the PHY comes down due to PHY
disable or loss of link, as the sas_phy.enabled member is not set until after
the transport disable routine is complete, which races with the PHY down
ISR.
As an imperfect solution, use sas_phy_data.enable as the flag to know if
the PHY is down due to disable. It's imperfect, as sas_phy_data is internal
to libsas.
I can't see another way without adding a new field to hisi_sas_phy and
managing it, or changing SCSI SAS transport.
Signed-off-by: John Garry
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
03 Apr, 2019
2 commits
-
commit 1d5de5bd311be7cd54f02f7cd164f0349a75c876 upstream.
Commit a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple
of physical block size") split one conditional into several separate
statements in an effort to provide more accurate warning messages when
a device reports a nonsensical value. However, this reorganization
accidentally dropped the precondition of the reported value being
larger than zero. This led to a warning getting emitted on devices
that do not report an optimal I/O size at all.
Remain silent if a device does not report an optimal I/O size.
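The restored precondition can be illustrated with a small userspace sketch. This is not the kernel's code: the function name `opt_xfer_usable`, its parameters, and the `warnings` counter (standing in for printed warnings) are hypothetical, chosen only to show that the zero check comes before any check that warns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter standing in for the warning messages a driver prints. */
static unsigned int warnings;

/*
 * Decide whether a reported optimal transfer size (in bytes) is usable.
 * A value of 0 means "not reported" and is rejected silently, before any
 * of the checks that warn about nonsensical values.
 */
static bool opt_xfer_usable(uint32_t opt_xfer_bytes, uint32_t physical_block_bytes)
{
    assert(physical_block_bytes != 0);  /* caller must know the block size */

    if (opt_xfer_bytes == 0)
        return false;                   /* not reported: stay silent */

    if (opt_xfer_bytes % physical_block_bytes != 0) {
        warnings++;                     /* nonsensical value: warn and ignore */
        return false;
    }

    return true;
}
```

With this ordering, a device that reports nothing leaves `warnings` untouched, while a value that is not a multiple of the physical block size is both rejected and counted.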
Fixes: a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of physical block size")
Cc: Randy Dunlap
Cc:
Reported-by: Hussam Al-Tayeb
Tested-by: Hussam Al-Tayeb
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit c14a57264399efd39514a2329c591a4b954246d8 upstream.
The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and
hence needs the disk->private_data pointer. Avoid clearing that pointer
before all affected I/O requests have finished. This patch prevents the
following crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Call trace:
scsi_mq_uninit_cmd+0x1c/0x30
scsi_end_request+0x7c/0x1b8
scsi_io_completion+0x464/0x668
scsi_finish_command+0xbc/0x160
scsi_eh_flush_done_q+0x10c/0x170
sas_scsi_recover_host+0x84c/0xa98 [libsas]
scsi_error_handler+0x140/0x5b0
kthread+0x100/0x12c
ret_from_fork+0x10/0x18
Cc: Christoph Hellwig
Cc: Ming Lei
Cc: Hannes Reinecke
Cc: Johannes Thumshirn
Cc: Jason Yan
Cc:
Signed-off-by: Bart Van Assche
Reported-by: Jason Yan
Reviewed-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
27 Mar, 2019
2 commits
-
commit 7f5203c13ba8a7b7f9f6ecfe5a4d5567188d7835 upstream.
The event pool used for queueing commands is destroyed fairly early in the
ibmvscsi_remove() code path. Since this happens prior to the call to
scsi_remove_host(), it is possible for further calls to queuecommand to be
processed, which manifest as a panic due to a NULL pointer dereference as
seen here:
PANIC: "Unable to handle kernel paging request for data at address
0x00000000"
Context process backtrace:
DSISR: 0000000042000000 ????Syscall Result: 0000000000000000
4 [c000000002cb3820] memcpy_power7 at c000000000064204
[Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4
5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable)
6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi]
7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod]
8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod]
9 [c000000002cb3be0] __blk_run_queue at c000000000429860
10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec
11 [c000000002cb3c40] process_one_work at c0000000000dac30
12 [c000000002cb3cd0] worker_thread at c0000000000db110
13 [c000000002cb3d80] kthread at c0000000000e3378
14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c
The kernel log buffer is flooded with this message:
[11261.952732] ibmvscsi: found no event struct in pool!
This patch reorders the operations during host teardown. Start by calling
the SRP transport and Scsi_Host remove functions to flush any outstanding
work and set the host offline. LLDD teardown follows including destruction
of the event pool, freeing the Command Response Queue (CRQ), and unmapping
any persistent buffers. The event pool destruction is protected by the
scsi_host lock, and before it the pool is purged of any requests for which
we never received a response. Finally, move the removal of the scsi host from
our global list to the end so that the host is easily locatable for
debugging purposes during teardown.
Cc: # v2.6.12+
Signed-off-by: Tyrel Datwyler
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit 7205981e045e752ccf96cf6ddd703a98c59d4339 upstream.
For each ibmvscsi host created during a probe or destroyed during a remove
we either add or remove that host to/from the global ibmvscsi_head
list. This runs the risk of concurrent modification.
This patch adds a simple spinlock around the list modification calls to
prevent concurrent updates, as is done similarly in the ibmvfc driver and
ipr driver.
Fixes: 32d6e4b6e4ea ("scsi: ibmvscsi: add vscsi hosts to global list_head")
Cc: # v4.10+
Signed-off-by: Tyrel Datwyler
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
24 Mar, 2019
6 commits
-
commit ec322937a7f152d68755dc8316523bf6f831b48f upstream.
This patch fixes LUN discovery when loop ID is not yet assigned by the
firmware during driver load/sg_reset operations. The driver will now search
for a new loop ID before retrying login.
Fixes: 48acad099074 ("scsi: qla2xxx: Fix N2N link re-connect")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Himanshu Madhani
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit a83da8a4509d3ebfe03bb7fffce022e4d5d4764f upstream.
It was reported that some devices report an OPTIMAL TRANSFER LENGTH of
0xFFFF blocks. That looks bogus, especially for a device with a
4096-byte physical block size.
Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's
reported physical block size.
To make the sanity checking conditionals more readable--and to
facilitate printing warnings--relocate the checking to a helper
function. No functional change aside from the printks.
Cc:
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759
Reported-by: Christoph Anton Mitterer
Reviewed-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit 0015437cc046e5ec2b57b00ff8312b8d432eac7c upstream.
Fix a performance issue where the queue depth for SmartIOC logical volumes is
set to 1, and allow the usual logical volume code to be executed.
Fixes: a052865fe287 (aacraid: Set correct Queue Depth for HBA1000 RAW disks)
Cc: stable@vger.kernel.org
Signed-off-by: Sagar Biradar
Reviewed-by: Dave Carroll
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit 3722e6a52174d7c3a00e6f5efd006ca093f346c1 upstream.
The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of
device-readable records and a single device-writable response entry:
struct virtio_scsi_ctrl_tmf
{
    // Device-readable part
    le32 type;
    le32 subtype;
    u8 lun[8];
    le64 id;
    // Device-writable part
    u8 response;
}
The above should be organised as two descriptor entries (or potentially
more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64
id" or after "u8 response".
The Linux driver doesn't respect that, with virtscsi_abort() and
virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf(). It
results in the original scsi command payload (or writable buffers) added to
the tmf.
This fixes the problem by leaving cmd->sc zeroed out, which makes
virtscsi_kick_cmd() add the tmf to the control vq without any payload.
Cc: stable@vger.kernel.org
Signed-off-by: Felipe Franciosi
Reviewed-by: Paolo Bonzini
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit 79edd00dc6a96644d76b4a1cb97d94d49e026768 ]
When a target sends Check Condition while the initiator is busy transmitting
re-queued data, a race between iscsi_complete_task() and
iscsi_xmit_task() can occur, eventually crashing with the following kernel
backtrace.
[3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[3326150.987549] ALERT: IP: [] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0
[3326150.987582] WARN: Oops: 0002 [#1] SMP
[3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin
[3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1
[3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016
[3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi]
[3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000
[3326150.987810] WARN: RIP: e030:[] [] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246
[3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480
[3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20
[3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008
[3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000
[3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08
[3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000
[3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660
[3326150.987918] WARN: Stack:
[3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18
[3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00
[3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400
[3326150.987964] WARN: Call Trace:
[3326150.987975] WARN: [] iscsi_xmitworker+0x2f0/0x360 [libiscsi]
[3326150.987988] WARN: [] process_one_work+0x1fc/0x3b0
[3326150.987997] WARN: [] worker_thread+0x2a5/0x470
[3326150.988006] WARN: [] ? __schedule+0x648/0x870
[3326150.988015] WARN: [] ? rescuer_thread+0x300/0x300
[3326150.988023] WARN: [] kthread+0xd5/0xe0
[3326150.988031] WARN: [] ? kthread_stop+0x110/0x110
[3326150.988040] WARN: [] ret_from_fork+0x3f/0x70
[3326150.988048] WARN: [] ? kthread_stop+0x110/0x110
[3326150.988127] ALERT: RIP [] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.988138] WARN: RSP
[3326150.988144] WARN: CR2: 0000000000000078
[3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]---
Commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix
list corruption regression") introduced "taskqueuelock" to fix list
corruption during the race, but this wasn't enough.
Re-setting of conn->task to NULL could race with iscsi_xmit_task().
iscsi_complete_task()
{
    ....
    if (conn->task == task)
        conn->task = NULL;
}
conn->task in iscsi_xmit_task() could be NULL and so will be task.
__iscsi_get_task(task) will crash (NULL pointer dereference), trying to
access refcount.
iscsi_xmit_task()
{
    struct iscsi_task *task = conn->task;
    __iscsi_get_task(task);
}
This commit takes an extra conn->session->back_lock in iscsi_xmit_task()
to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if
iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race,
iscsi_xmit_task() increments task->refcount
(__iscsi_get_task), ensuring iscsi_complete_task() will not iscsi_free_task().
Signed-off-by: Anoob Soman
Signed-off-by: Bob Liu
Acked-by: Lee Duncan
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 388a49959ee4e4e99f160241d9599efa62cd4299 ]
In qla2x00_async_tm_cmd, we reference off sp after it has been freed. This
caused a panic on a system running a slub debug kernel. Since fcport is
passed in anyway, just use that instead.
Signed-off-by: Bill Kuzeja
Acked-by: Giridhar Malavali
Acked-by: Himanshu Madhani
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
14 Mar, 2019
6 commits
-
commit 5e420fe635813e5746b296cfc8fff4853ae205a2 upstream.
Add missing break statement and fix indentation issue.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: 9cb62fa24e0d ("aacraid: Log firmware AIF messages")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit d8f6382a7d026989029e2e50c515df954488459b ]
This reverts commit bbc0f8bd88abefb0f27998f40a073634a3a2db89.
It added a warning whose intent was to check whether the rport was still
linked into the peer list. It doesn't work as intended and gives false
positive warnings for two reasons:
1) If the rport is never linked into the peer list it will not be
considered empty since the list_head is never initialized.
2) If the rport is deleted from the peer list using list_del_rcu(), then
the list_head is in an undefined state and it is not considered empty.
Signed-off-by: Ross Lagerwall
Reviewed-by: Hannes Reinecke
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 8437fcf14deed67e5ad90b5e8abf62fb20f30881 ]
The "hostdata->dev" pointer is NULL here. We set "hostdata->dev = dev;"
later in the function and we also use "hostdata->dev" when we call
dma_free_attrs() in NCR_700_release().
This bug predates git version control.
Signed-off-by: Dan Carpenter
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit b2d3492fc591b1fb46b81d79ca1fc44cac6ae0ae ]
There are two issues here. First if cmgr->hba is not set early enough then
it leads to a NULL dereference. Second, if we don't completely initialize
cmgr->io_bdt_pool[] then we end up dereferencing uninitialized pointers.
Fixes: 853e2bd2103a ("[SCSI] bnx2fc: Broadcom FCoE offload driver")
Signed-off-by: Dan Carpenter
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 40d07b523cf434f252b134c86b1f8f2d907ffb0b ]
The WRITE SAME(10) and (16) implementations didn't take account of the
buffer wrap required when the virtual_gb parameter is greater than 0.
Fix that and rename the fake_store() function to lba2fake_store() to lessen
confusion with the global fake_storep pointer. Bump version date.
Signed-off-by: Douglas Gilbert
Reported-by: Bart Van Assche
Tested-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 5d8fc4a9f0eec20b6c07895022a6bea3fb6dfb38 ]
The issue fixed in this commit is that when libfc found it had received an
invalid FLOGI response from the FC switch, it would return without freeing the
fc frame, which is just the skb data. This would cause a memory leak if the FC
switch keeps sending invalid FLOGI responses.
This fix makes it execute `fc_frame_free(fp)` before returning
from function `fc_lport_flogi_resp`.
Signed-off-by: Ming Lu
Reviewed-by: Hannes Reinecke
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
10 Mar, 2019
1 commit
-
commit 4a067cf823d9d8e50d41cfb618011c0d4a969c72 upstream.
Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to
DID_OK for various cases including DID_NEXUS_FAILURE. Commit
2a842acab109 ("block: introduce new block status code type") replaced this
function with scsi_result_to_blk_status() and removed the host-byte
resetting code for the DID_NEXUS_FAILURE case. As the line
set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose
this was an editing mistake.
The fact that the host byte remains set after 4.13 is causing problems with
the sg_persist tool, which now returns success rather than exit status 24
when a RESERVATION CONFLICT error is encountered.
Fixes: 2a842acab109 "block: introduce new block status code type"
Signed-off-by: Martin Wilck
Reviewed-by: Hannes Reinecke
Reviewed-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
06 Mar, 2019
4 commits
-
[ Upstream commit fe35a40e675473eb65f2f5462b82770f324b5689 ]
Assign fc_vport to ln->fc_vport before calling csio_fcoe_alloc_vnp() to
avoid a NULL pointer dereference in csio_vport_set_state().
ln->fc_vport is dereferenced in csio_vport_set_state().
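The ordering requirement can be shown with a minimal userspace sketch. All names here (`struct vport`, `struct lnode`, `set_vport_state`, `alloc_vnp`, `lnode_attach_vport`) are hypothetical stand-ins rather than the driver's actual definitions; the point is only that the pointer is assigned before the call that dereferences it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. */
struct vport { int state; };
struct lnode { struct vport *vport; };

/* Dereferences ln->vport, so the pointer must already be assigned. */
static void set_vport_state(struct lnode *ln, int state)
{
    assert(ln->vport != NULL);
    ln->vport->state = state;
}

/* Allocation step that reports its result through set_vport_state(). */
static int alloc_vnp(struct lnode *ln)
{
    set_vport_state(ln, 1);
    return 0;
}

/* Correct ordering: assign the pointer first, then call the step that uses it. */
static int lnode_attach_vport(struct lnode *ln, struct vport *vp)
{
    ln->vport = vp;
    return alloc_vnp(ln);
}
```

Swapping the two statements in `lnode_attach_vport()` would make `set_vport_state()` run with a NULL `ln->vport`, which is the failure the commit describes.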
Signed-off-by: Varun Prakash
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit c41f59884be5cca293ed61f3d64637dbba3a6381 ]
We cannot wait on a completion object in the lpfc_nvme_targetport structure
in the _destroy_targetport() code path because the NVMe/fc transport will
free that structure immediately after the .targetport_delete() callback.
This results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
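The idea of keeping the completion on the waiter's stack can be modelled in a single-threaded userspace sketch. Everything below is an assumption made for illustration: `struct completion` is a plain flag rather than the kernel primitive, and `struct port`, `port_delete_cb`, `transport_unregister`, and `destroy_port` are hypothetical names. A real driver would block in wait_for_completion() instead of reading a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, single-threaded stand-in for a kernel completion. */
struct completion { bool done; };

/* Hypothetical port object whose memory is owned by the transport. */
struct port {
    struct completion *unreg_done;  /* points at the waiter's stack variable */
};

/* Driver delete callback: signal the waiter through the saved pointer. */
static void port_delete_cb(struct port *p)
{
    assert(p->unreg_done != NULL);
    p->unreg_done->done = true;
}

/* Transport side: run the callback, then free the port immediately. */
static void transport_unregister(struct port *p)
{
    port_delete_cb(p);
    free(p);
}

/*
 * Driver side: the completion lives on this function's stack, so checking
 * it after transport_unregister() never touches the freed port.
 */
static bool destroy_port(struct port *p)
{
    struct completion done = { false };

    p->unreg_done = &done;
    transport_unregister(p);        /* p must not be used from here on */
    return done.done;
}
```

Had the completion been a member of `struct port`, the final read in `destroy_port()` would access memory the transport had already freed.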
Signed-off-by: Ewan D. Milne
Acked-by: James Smart
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 7961cba6f7d8215fa632df3d220e5154bb825249 ]
We cannot wait on a completion object in the lpfc_nvme_lport structure in
the _destroy_localport() code path because the NVMe/fc transport will free
that structure immediately after the .localport_delete() callback. This
results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
Signed-off-by: Ewan D. Milne
Acked-by: James Smart
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
commit ffeafdd2bf0b280d67ec1a47ea6287910d271f3f upstream.
The sysfs phy_identifier attribute for a sas_end_device comes from the rphy
phy_identifier value.
Currently this is not being set for rphys with an end device attached, so
we see incorrect symlinks from systemd disk/by-path:
root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3
Indeed, each sas_end_device phy_identifier value is 0:
root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier
0
root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier
0
This patch fixes the discovery code to set the phy_identifier. With this,
we now get proper symlinks:
root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
Reported-by: dann frazier
Signed-off-by: John Garry
Reviewed-by: Jason Yan
Tested-by: dann frazier
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
27 Feb, 2019
6 commits
-
[ Upstream commit 9e8f1c79831424d30c0e3df068be7f4a244157c9 ]
In case of ->set_param() and ->bind_conn(), the cxgb4i driver does not wait
for cmd completion, which can create race conditions. To avoid this, add
wait_for_completion().
Signed-off-by: Varun Prakash
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 9be9db9f78f52ef03ee90063730cb9d730e7032b ]
Albeit we no longer rely on those hard-coded descriptor sizes, we still use
them as our defaults, so better get it right. While adding its sysfs
entries, we forgot to update the geometry descriptor size. It is 0x48
according to UFS2.1, and wasn't changed in UFS3.0.
[mkp: typo]
Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor)
Signed-off-by: Avri Altman
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 34a2ce887668db9dda4b56e6f155c49ac13f3e54 ]
When the driver finds an invalid destination MAC for the first unreachable
target, and before it completes the PATH_REQ operation, set the new ep_state to
OFFLDCONN_NONE so that, as part of the driver ep_poll mechanism, the upper
open-iscsi layer is notified to complete the login process on the first
unreachable target and thus proceed with login to other reachable targets.
Signed-off-by: Manish Rangankar
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit ce9e7bce43526626f7cffe2e657953997870197e ]
hba->is_sys_suspended is set after successful system suspend but
not cleared after successful system resume.
According to current behavior, hba->is_sys_suspended will not be set if
host is runtime-suspended but not system-suspended. Thus we shall align with
the same policy: clear this flag even if host remains runtime-suspended after
ufshcd_system_resume is successfully returned.
Simply fix this flag to correct host status logs.
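The flag policy described above can be sketched in userspace C. This is an illustrative model only: `struct host_state`, `host_system_suspend`, and `host_system_resume` are hypothetical names, and the hardware power-up step is represented by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the host state; only these two flags matter here. */
struct host_state {
    bool runtime_suspended;
    bool is_sys_suspended;
};

/* System suspend: record the state once the suspend has succeeded. */
static int host_system_suspend(struct host_state *h)
{
    assert(h != NULL);
    h->is_sys_suspended = true;
    return 0;
}

/*
 * System resume: a runtime-suspended host is left powered down, but the
 * system-suspend flag is still cleared because the system itself resumed.
 */
static int host_system_resume(struct host_state *h)
{
    assert(h != NULL);
    if (!h->runtime_suspended) {
        /* A real driver would bring the hardware back up here. */
    }
    h->is_sys_suspended = false;
    return 0;
}
```

After a suspend and resume cycle on a runtime-suspended host, `is_sys_suspended` is false again while `runtime_suspended` stays true, so status logs reflect the actual state.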
Signed-off-by: Stanley Chu
Reviewed-by: Avri Altman
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit cc29a1b0a3f2597ce887d339222fa85b9307706d ]
scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates the
command size to allocate based on the prot_capabilities. In the isci
driver, scsi_host_set_prot() is called after scsi_add_host() so the command
size gets calculated to be smaller than it needs to be. Eventually,
scsi_mq_init_request() locates the 'prot_sdb' after the command assuming it
was sized correctly and a buffer overrun may occur.

However, since blk_mq_alloc_rqs() rounds up to the nearest cache line
size, the mistake can go unnoticed.

The bug was noticed after the struct request size was reduced by commit
9d037ad707ed ("block: remove req->timeout_list"), which likely reduced the
allocated space for the request by an entire cache line, enough that the
overflow could be hit and it caused a panic, on boot, at:

RIP: 0010:t10_pi_complete+0x77/0x1c0
Call Trace:
sd_done+0xf5/0x340
scsi_finish_command+0xc3/0x120
blk_done_softirq+0x83/0xb0
__do_softirq+0xa1/0x2e6
irq_exit+0xbc/0xd0
call_function_single_interrupt+0xf/0x20

sd_done() would call scsi_prot_sg_count() which reads the number of
entities in 'prot_sdb', but since 'prot_sdb' is located after the end of
the allocated space it reads a garbage number and erroneously calls
t10_pi_complete().

To prevent this, the calls to scsi_host_set_prot() are moved into
isci_host_alloc() before the call to scsi_add_host(). Out of caution, also
move the similar call to scsi_host_set_guard().
Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com
Signed-off-by: Logan Gunthorpe
Cc: Intel SCU Linux support
Cc: Artur Paszkiewicz
Cc: "James E.J. Bottomley"
Cc: "Martin K. Petersen"
Cc: Christoph Hellwig
Cc: Jens Axboe
Cc: Jeff Moyer
Reviewed-by: Jeff Moyer
Reviewed-by: Jens Axboe
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 72b4a0465f995175a2e22cf4a636bf781f1f28a7 ]
The return code should be checked in case qla4xxx_copy_from_fwddb_param fails.
Signed-off-by: YueHaibing
Acked-by: Manish Rangankar
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
20 Feb, 2019
1 commit
-
commit e4a056987c86f402f1286e050b1dee3f4ce7c7eb upstream.
The problem is that the default for MQ is not to gather entropy, whereas
the default for the legacy queue was always to gather it. The original
attempt to fix entropy gathering for rotational disks under MQ added an
else branch in sd_read_block_characteristics(). Unfortunately, the entire
check isn't reached if the device has no characteristics VPD page. Since
this page was only introduced in SBC-3 and it's optional anyway, most less
expensive rotational disks don't have one, meaning they all stopped
gathering entropy when we made MQ the default. In a wholly unrelated
change, openssl and openssh won't function until the random number
generator is initialised, meaning lots of people have been seeing large
delays before they could log into systems with default MQ kernels due to
this lack of entropy, because it now can take tens of minutes to initialise
the kernel random number generator.

The fix is to set the non-rotational and add-randomness flags
unconditionally early on in the disk initialization path, so they can be
reset only if the device actually reports being non-rotational via the VPD
page.
Reported-by: Mikael Pettersson
Fixes: 83e32a591077 ("scsi: sd: Contribute to randomness when running rotational device")
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley
Reviewed-by: Jens Axboe
Reviewed-by: Xuewei Zhang
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
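The ordering this message describes (apply defaults first, override only on positive evidence from the device) can be sketched in plain C. This is an illustrative model, not the sd driver code: `struct queue_model`, `struct vpd_model` and `init_queue_flags()` are hypothetical names, a NULL `vpd` pointer stands for a device without the block characteristics VPD page, and the default values (rotational, entropy gathering on) are an interpretation of the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two queue flags discussed in the message. */
struct queue_model {
    bool nonrot;     /* device is treated as non-rotational */
    bool add_random; /* device contributes to the entropy pool */
};

/* Hypothetical model of the optional block characteristics VPD page. */
struct vpd_model {
    bool reports_nonrot;
};

void init_queue_flags(struct queue_model *q, const struct vpd_model *vpd)
{
    assert(q != NULL);

    /* Defaults are applied unconditionally, before any VPD lookup, so a
     * device without the page still ends up gathering entropy. */
    q->nonrot = false;
    q->add_random = true;

    /* No VPD page: keep the defaults. */
    if (vpd == NULL)
        return;

    /* Override only when the device actually reports non-rotational. */
    if (vpd->reports_nonrot) {
        q->nonrot = true;
        q->add_random = false;
    }
}
```

The design point is that the early return for a missing page can no longer skip the flag setup, which was the failure mode of the earlier else-branch approach.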
13 Feb, 2019
6 commits
-
commit 42caa0edabd6a0a392ec36a5f0943924e4954311 upstream.
The aic94xx driver is currently failing to load with errors like
sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision'
because the PCI code recently added a file named 'revision' to every
PCI device. Fix this by renaming the aic94xx revision file to
aic_revision. This is safe to do because, as far as I can tell,
nothing in userspace relies on the current aic94xx revision file,
so it can be renamed without breaking anything.
Fixes: 702ed3be1b1b ("PCI: Create revision file in sysfs")
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
commit bb61b843ffd46978d7ca5095453e572714934eeb upstream.
Presently, when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with the CPU thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:

cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1 D 0 890 2 0x00000808
Workqueue: events work_for_cpu_fn
Call Trace:
0x4d72136320 (unreliable)
__switch_to+0x2cc/0x460
__schedule+0x2bc/0xac0
schedule+0x40/0xb0
cxlflash_remove+0xec/0x640 [cxlflash]
cxlflash_probe+0x370/0x8f0 [cxlflash]
local_pci_probe+0x6c/0x140
work_for_cpu_fn+0x38/0x60
process_one_work+0x260/0x530
worker_thread+0x280/0x5d0
kthread+0x1a8/0x1b0
ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.

The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed, 'cxlflash_cfg->state' never changes from
STATE_PROBING, hence the deadlock occurs.

We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().
Cc: stable@vger.kernel.org
Fixes: c21e0bbfc485 ("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
-
[ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ]
Problem:
- during driver initialization, the driver polls the firmware
for KERNEL_UP with a 30-second timeout.
- if the firmware is not ready after 30 seconds,
the driver will not be loaded.

Fix:
- change the timeout from 30 seconds to 3 minutes.
Reported-by: Feng Li
Reviewed-by: Ajish Koshy
Reviewed-by: Murthy Bhat
Reviewed-by: Dave Carroll
Reviewed-by: Kevin Barnett
Signed-off-by: Mahesh Rajashekhara
Signed-off-by: Don Brace
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]
- fix race condition when a unit is deleted after an RLL,
and before we have gotten the LV_STATUS page of the unit.
- In this case we will get a standard inquiry, rather than
the desired page. This will result in a unit being presented
which no longer exists.
- If we ask for LV_STATUS, ensure we get LV_STATUS
Reviewed-by: Murthy Bhat
Reviewed-by: Mahesh Rajashekhara
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Dave Carroll
Signed-off-by: Don Brace
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]
Reviewed-by: Scott Benesh
Reviewed-by: Ajish Koshy
Reviewed-by: Murthy Bhat
Reviewed-by: Mahesh Rajashekhara
Reviewed-by: Dave Carroll
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Mahesh Rajashekhara
Signed-off-by: Don Brace
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
-
[ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ]
Currently the SAS SSP connection time is 1ms, which means the link
connection will fail if there is no IO response after this period.

For some disks handling large IO (such as 512k), 1ms is not enough, so
change it to 5ms.
Signed-off-by: Xiang Chen
Signed-off-by: John Garry
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin