16 Sep, 2020
1 commit
-
Pull SCSI fix from James Bottomley:
"Just one fix in libsas for a resource leak in an error path"* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: libsas: Fix error path in sas_notify_lldd_dev_found()
10 Sep, 2020
1 commit
-
In sas_notify_lldd_dev_found(), if we can't allocate the necessary
resources, then it seems like the wrong thing to mark the device as found
and to increment the reference count. None of the callers ever drop the
reference in that situation.[mkp: tweaked commit desc based on feedback from John]
Link: https://lore.kernel.org/r/20200905125836.GF183976@mwanda
Fixes: 735f7d2fedf5 ("[SCSI] libsas: fix domain_device leak")
Reviewed-by: Jason Yan
Acked-by: John Garry
Signed-off-by: Dan Carpenter
Signed-off-by: Martin K. Petersen
09 Sep, 2020
1 commit
-
Pull SCSI fixes from James Bottomley:
"Eleven fixes, mostly in drivers or minor fixes in driver related
infrastructure libraries (target, libfc and libsas).Most of the bugs fixed only show up under rare circumstances, the
exception being the endianness problem in qla2xxx which is used as a
device on some sparc systems"* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Don't call disable_irq from IRQ poll handler
scsi: megaraid_sas: Don't call disable_irq from process IRQ poll
scsi: target: iscsi: Fix hang in iscsit_access_np() when getting tpg->np_login_sem
scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATA
scsi: target: iscsi: Fix data digest calculation
scsi: lpfc: Update lpfc version to 12.8.0.4
scsi: lpfc: Extend the RDF FPIN Registration descriptor for additional events
scsi: lpfc: Fix FLOGI/PLOGI receive race condition in pt2pt discovery
scsi: lpfc: Fix setting IRQ affinity with an empty CPU mask
scsi: qla2xxx: Fix regression on sparc64
scsi: libfc: Fix for double free()
scsi: pm8001: Fix memleak in pm8001_exec_internal_task_abort
03 Sep, 2020
2 commits
-
disable_irq() might sleep, replace it with disable_irq_nosync(). For
synchronisation 'irq_poll_scheduled' is sufficientFixes: 320e77acb3 scsi: mpt3sas: Irq poll to avoid CPU hard lockups
Link: https://lore.kernel.org/r/20200901145026.12174-1-thenzl@redhat.com
Signed-off-by: Tomas Henzl
Signed-off-by: Martin K. Petersen -
disable_irq() might sleep. Replace it with disable_irq_nosync() which is
sufficient as irq_poll_scheduled protects against concurrently running
complete_cmd_fusion() from megasas_irqpoll() and megasas_isr_fusion().Link: https://lore.kernel.org/r/20200827165332.8432-1-thenzl@redhat.com
Fixes: a6ffd5bf681 scsi: megaraid_sas: Call disable_irq from process IRQ poll
Signed-off-by: Tomas Henzl
Signed-off-by: Martin K. Petersen
02 Sep, 2020
2 commits
-
It was discovered that sdparm will fail when attempting to disable write
cache on a SATA disk connected via libsas.In the ATA command set the write cache state is controlled through the SET
FEATURES operation. This is roughly corresponds to MODE SELECT in SCSI and
the latter command is what is used in the SCSI-ATA translation layer. A
subtle difference is that a MODE SELECT carries data whereas SET FEATURES
is defined as a non-data command in ATA.Set the DMA data direction to DMA_NONE if the requested ATA command is
identified as non-data.[mkp: commit desc]
Fixes: fa1c1e8f1ece ("[SCSI] Add SATA support to libsas")
Link: https://lore.kernel.org/r/1598426666-54544-1-git-send-email-luojiaxing@huawei.com
Reviewed-by: John Garry
Reviewed-by: Jason Yan
Signed-off-by: Luo Jiaxing
Signed-off-by: Martin K. Petersen -
Pull SCSI fixes from James Bottomley:
"Three minor fixes, all in drivers"* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_debug: Remove superfluous close zone in resp_open_zone()
scsi: libcxgbi: Fix a use after free in cxgbi_conn_xmit_pdu()
scsi: qedf: Fix null ptr reference in qedf_stag_change_work
01 Sep, 2020
7 commits
-
Update lpfc version to 12.8.0.4
Link: https://lore.kernel.org/r/20200828175332.130300-5-james.smart@broadcom.com
Co-developed-by: Dick Kennedy
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Signed-off-by: Martin K. Petersen -
Currently the driver registers for Link Integrity events only.
This patch adds registration for the following FPIN types:
- Delivery Notifications
- Congestion Notification
- Peer Congestion NotificationLink: https://lore.kernel.org/r/20200828175332.130300-4-james.smart@broadcom.com
Co-developed-by: Dick Kennedy
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Signed-off-by: Martin K. Petersen -
The driver is unable to successfully login with remote device. During pt2pt
login, the driver completes its FLOGI request with the remote device having
WWN precedence. The remote device issues its own (delayed) FLOGI after
accepting the driver's and, upon transmitting the FLOGI, immediately
recognizes it has already processed the driver's FLOGI thus it transitions
to sending a PLOGI before waiting for an ACC to its FLOGI.In the driver, the FLOGI is received and an ACC sent, followed by the PLOGI
being received and an ACC sent. The issue is that the PLOGI reception
occurs before the response from the adapter from the FLOGI ACC is
received. Processing of the PLOGI sets state flags to perform the REG_RPI
mailbox command and proceed with the rest of discovery on the port. The
same completion routine used by both FLOGI and PLOGI is generic in
nature. One of the things it does is clear flags, and those flags happen to
drive the rest of discovery. So what happened was the PLOGI processing set
the flags, the FLOGI ACC completion cleared them, thus when the PLOGI ACC
completes it doesn't see the flags and stops.Fix by modifying the generic completion routine to not clear the rest of
discovery flag (NLP_ACC_REGLOGIN) unless the completion is also associated
with performing a mailbox command as part of its handling. For things such
as FLOGI ACC, there isn't a subsequent action to perform with the adapter,
thus there is no mailbox cmd ptr. PLOGI ACC though will perform REG_RPI
upon completion, thus there is a mailbox cmd ptr.Link: https://lore.kernel.org/r/20200828175332.130300-3-james.smart@broadcom.com
Co-developed-by: Dick Kennedy
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Signed-off-by: Martin K. Petersen -
Some systems are reporting the following log message during driver unload
or system shutdown:ics_rtas_set_affinity: No online cpus in the mask
A prior commit introduced the writing of an empty affinity mask in calls to
irq_set_affinity_hint() when disabling interrupts or when there are no
remaining online CPUs to service an eq interrupt. At least some ppc64
systems are checking whether affinity masks are empty or not.Do not call irq_set_affinity_hint() with an empty CPU mask.
Fixes: dcaa21367938 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20200828175332.130300-2-james.smart@broadcom.com
Cc: # v5.5+
Co-developed-by: Dick Kennedy
Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Signed-off-by: Martin K. Petersen -
Commit 98aee70d19a7 ("qla2xxx: Add endianizer to max_payload_size
modifier.") in 2014 broke qla2xxx on sparc64, e.g. as in the Sun Blade 1000
/ 2000. Unbreak by partial revert to fix endianness in nvram firmware
default initialization. Also mark the second frame_payload_size in nvram_t
__le16 to avoid new sparse warnings.Link: https://lore.kernel.org/r/20200827.222729.1875148247374704975.rene@exactcode.com
Fixes: 98aee70d19a7 ("qla2xxx: Add endianizer to max_payload_size modifier.")
Reviewed-by: Himanshu Madhani
Reviewed-by: Bart Van Assche
Acked-by: Arun Easi
Signed-off-by: René Rebe
Signed-off-by: Martin K. Petersen -
Fix for '&fp->skb' double free.
Link:
https://lore.kernel.org/r/20200825093940.19612-1-jhasan@marvell.com
Reported-by: Dan Carpenter
Signed-off-by: Javed Hasan
Signed-off-by: Martin K. Petersen -
When pm8001_tag_alloc() fails, task should be freed just like it is done in
the subsequent error paths.Link: https://lore.kernel.org/r/20200823091453.4782-1-dinghao.liu@zju.edu.cn
Acked-by: Jack Wang
Signed-off-by: Dinghao Liu
Signed-off-by: Martin K. Petersen
25 Aug, 2020
3 commits
-
resp_open_zone() always calls zbc_open_zone() with parameter explicit set
to true.If zbc_open_zone() is called with parameter explicit set to true, and the
current zone state is implicit open, it will call zbc_close_zone() on the
zone before proceeding.Therefore, there is no need for resp_open_zone() to call zbc_close_zone()
on an implicitly open zone before calling zbc_open_zone().Remove superfluous close zone in resp_open_zone().
Link: https://lore.kernel.org/r/20200821130007.39938-1-niklas.cassel@wdc.com
Reviewed-by: Damien Le Moal
Signed-off-by: Niklas Cassel
Signed-off-by: Martin K. Petersen -
We accidentally move this logging printk after the free, but that leads to
a use after free.Link: https://lore.kernel.org/r/20200824085933.GD208317@mwanda
Fixes: e33c2482289b ("scsi: cxgb4i: Add support for iSCSI segmentation offload")
Signed-off-by: Dan Carpenter
Signed-off-by: Martin K. Petersen -
Link: https://lore.kernel.org/r/20200824033436.45570-1-yebin10@huawei.com
Acked-by: Saurav Kashyap
Signed-off-by: Ye Bin
Signed-off-by: Martin K. Petersen
24 Aug, 2020
1 commit
-
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva
18 Aug, 2020
21 commits
-
FCP T10-PI and NVMe features are independent of each other. This patch
allows both features to co-exist.This reverts commit 5da05a26b8305a625bc9d537671b981795b46dab.
Link: https://lore.kernel.org/r/20200806111014.28434-12-njavali@marvell.com
Fixes: 5da05a26b830 ("scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe")
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
FCoE adapter initialization failed for ISP8021 with the following patch
applied. In addition, reproduction of the issue the patch originally tried
to address has been unsuccessful.This reverts commit 3cb182b3fa8b7a61f05c671525494697cba39c6a.
Link: https://lore.kernel.org/r/20200806111014.28434-11-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Saurav Kashyap
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
NVMEAsync command is being submitted to QLA while the same NVMe controller
is in the middle of reset. The reset path has deleted the association and
freed aen_op->fcp_req.private. Add a check for this private pointer before
issuing the command....
6 [ffffb656ca11fce0] page_fault at ffffffff8c00114e
[exception RIP: qla_nvme_post_cmd+394]
RIP: ffffffffc0d012ba RSP: ffffb656ca11fd98 RFLAGS: 00010206
RAX: ffff8fb039eda228 RBX: ffff8fb039eda200 RCX: 00000000000da161
RDX: ffffffffc0d4d0f0 RSI: ffffffffc0d26c9b RDI: ffff8fb039eda220
RBP: 0000000000000013 R8: ffff8fb47ff6aa80 R9: 0000000000000002
R10: 0000000000000000 R11: ffffb656ca11fdc8 R12: ffff8fb27d04a3b0
R13: ffff8fc46dd98a58 R14: 0000000000000000 R15: ffff8fc4540f0000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
7 [ffffb656ca11fe08] nvme_fc_start_fcp_op at ffffffffc0241568 [nvme_fc]
8 [ffffb656ca11fe50] nvme_fc_submit_async_event at ffffffffc0241901 [nvme_fc]
9 [ffffb656ca11fe68] nvme_async_event_work at ffffffffc014543d [nvme_core]
10 [ffffb656ca11fe98] process_one_work at ffffffff8b6cd437
11 [ffffb656ca11fed8] worker_thread at ffffffff8b6cdcef
12 [ffffb656ca11ff10] kthread at ffffffff8b6d3402
13 [ffffb656ca11ff50] ret_from_fork at ffffffff8c000255--
PID: 37824 TASK: ffff8fb033063d80 CPU: 20 COMMAND: "kworker/u97:451"
0 [ffffb656ce1abc28] __schedule at ffffffff8be629e3
1 [ffffb656ce1abcc8] schedule at ffffffff8be62fe8
2 [ffffb656ce1abcd0] schedule_timeout at ffffffff8be671ed
3 [ffffb656ce1abd70] wait_for_completion at ffffffff8be639cf
4 [ffffb656ce1abdd0] flush_work at ffffffff8b6ce2d5
5 [ffffb656ce1abe70] nvme_stop_ctrl at ffffffffc0144900 [nvme_core]
6 [ffffb656ce1abe80] nvme_fc_reset_ctrl_work at ffffffffc0243445 [nvme_fc]
7 [ffffb656ce1abe98] process_one_work at ffffffff8b6cd437
8 [ffffb656ce1abed8] worker_thread at ffffffff8b6cdb50
9 [ffffb656ce1abf10] kthread at ffffffff8b6d3402
10 [ffffb656ce1abf50] ret_from_fork at ffffffff8c000255Link: https://lore.kernel.org/r/20200806111014.28434-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
OS boot during Boot from SAN was stuck at dracut emergency shell after
enabling NVMe driver parameter. For non-MQ support the driver was enabling
MQ. Add a check to confirm if FW supports MQ.Link: https://lore.kernel.org/r/20200806111014.28434-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Saurav Kashyap
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
qla_nvme_register_hba() puts out a warning when there are not enough queue
pairs available for FC-NVME. Just fail the NVME registration rather than a
WARNING + call Trace.Link: https://lore.kernel.org/r/20200806111014.28434-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Arun Easi
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
ql2xextended_error_logging can now be set to 1 to get the default mask
value, as opposed to at module load time only.Link: https://lore.kernel.org/r/20200806111014.28434-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Arun Easi
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
Update debug level and message for ELS IOCB done.
Link: https://lore.kernel.org/r/20200806111014.28434-6-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
Multipath errors were seen during failback due to login timeout. The
remote device sent LOGO, the local host tore down the session and did
relogin. The RSCN arrived indicates remote device is going through failover
after which the relogin is in a 20s timeout phase. At this point the
driver is stuck in the relogin process. Add a fix to delete the session as
part of abort/flush the login.Link: https://lore.kernel.org/r/20200806111014.28434-5-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
Correct the supported speeds for 16G Mezz card.
Link: https://lore.kernel.org/r/20200806111014.28434-4-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
Perform implicit logout to flush I/O on zone disable.
Link: https://lore.kernel.org/r/20200806111014.28434-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Himanshu Madhani
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
On Zone Disable, certain switches would ignore all commands. This causes
timeout for both switch scan command and abort of that command. On
detection of this condition, all sessions will be shutdown.Link: https://lore.kernel.org/r/20200806111014.28434-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani
Signed-off-by: Quinn Tran
Signed-off-by: Himanshu Madhani
Signed-off-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
Improves readability of qla_mbx.c.
Link: https://lore.kernel.org/r/20200805200546.22497-1-ematsumiya@suse.de
Reviewed-by: Himanshu Madhani
Reviewed-by: Roman Bolshakov
Signed-off-by: Enzo Matsumiya
Signed-off-by: Martin K. Petersen -
John Garry reported 'sdebug_q_cmd_complete: scp is NULL' failures that were
mainly seen on aarch64 machines (e.g. RPi 4 with four A72 CPUs). The
problem was tracked down to a missing critical section on a "short circuit"
path. Namely, the time to process the current command so far has already
exceeded the requested command duration (i.e. the number of nanoseconds in
the ndelay parameter).The random=1 parameter setting was pivotal in finding this error. The
failure scenario involved first taking that "short circuit" path (due to a
very short command duration) and then taking the more likely
hrtimer_start() path (due to a longer command duration). With random=1 each
command's duration is taken from the uniformly distributed [0..ndelay)
interval. The fio utility also helped by reliably generating the error
scenario at about once per minute on a RPi 4 (64 bit OS).Link: https://lore.kernel.org/r/20200813155738.109298-1-dgilbert@interlog.com
Reported-by: John Garry
Reviewed-by: Lee Duncan
Signed-off-by: Douglas Gilbert
Signed-off-by: Martin K. Petersen -
If the bit corresponding to a task in the Doorbell register has been
cleared, no need to poll the status of the task on the device side and to
send an Abort Task TM. Instead, let it directly goto cleanup.In addition, to keep original debug output, move the goto below the debug
print.Link: https://lore.kernel.org/r/20200811141859.27399-3-huobean@gmail.com
Reviewed-by: Stanley Chu
Reviewed-by: Can Guo
Signed-off-by: Bean Huo
Signed-off-by: Martin K. Petersen -
If somehow no interrupt notification is raised for a completed request and
its doorbell bit is cleared by host, UFS driver needs to cleanup its
outstanding bit in ufshcd_abort(). Otherwise, system may behave abnormally
in the following scenario:After ufshcd_abort() returns, this request will be requeued by SCSI layer
with its outstanding bit set. Any future completed request will trigger
ufshcd_transfer_req_compl() to handle all "completed outstanding bits". At
this time the "abnormal outstanding bit" will be detected and the "requeued
request" will be chosen to execute request post-processing flow. This is
wrong because this request is still "alive".Link: https://lore.kernel.org/r/20200811141859.27399-2-huobean@gmail.com
Reviewed-by: Can Guo
Acked-by: Avri Altman
Signed-off-by: Stanley Chu
Signed-off-by: Bean Huo
Signed-off-by: Martin K. Petersen -
For shared interrupts, the interrupt status might be zero, so check that
first.Link: https://lore.kernel.org/r/20200811133936.19171-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman
Signed-off-by: Adrian Hunter
Signed-off-by: Martin K. Petersen -
The interrupt might be shared, in which case it is not an error for the
interrupt handler to be called when the interrupt status is zero, so don't
print the message unless there was enabled interrupt status.Link: https://lore.kernel.org/r/20200811133936.19171-1-adrian.hunter@intel.com
Fixes: 9333d7757348 ("scsi: ufs: Fix irq return code")
Reviewed-by: Avri Altman
Signed-off-by: Adrian Hunter
Signed-off-by: Martin K. Petersen -
Intel EHL UFS host controller advertises auto-hibernate capability but it
does not work correctly. Add a quirk for that.[mkp: checkpatch fix]
Link: https://lore.kernel.org/r/20200810141024.28859-1-adrian.hunter@intel.com
Fixes: 8c09d7527697 ("scsi: ufshdc-pci: Add Intel PCI IDs for EHL")
Acked-by: Stanley Chu
Signed-off-by: Adrian Hunter
Signed-off-by: Martin K. Petersen -
Fix incorrect calculation of "ms" based waiting time in function
ufs_mtk_setup_clocks().Link: https://lore.kernel.org/r/20200809055702.20140-1-stanley.chu@mediatek.com
Fixes: 9006e3986f66 ("scsi: ufs-mediatek: Do not gate clocks if auto-hibern8 is not entered yet")
Reviewed-by: Avri Altman
Signed-off-by: Stanley Chu
Signed-off-by: Martin K. Petersen -
In ufshcd_suspend(), after clk-gating is suspended and link is set
as Hibern8 state, ufshcd_hold() is still possibly invoked before
ufshcd_suspend() returns. For example, MediaTek's suspend vops may
issue UIC commands which would call ufshcd_hold() during the command
issuing flow.Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
then ufshcd_hold() may enter infinite loops because there is no
clk-ungating work scheduled or pending. In this case, ufshcd_hold()
shall just bypass, and keep the link as Hibern8 state.Link: https://lore.kernel.org/r/20200809050734.18740-1-stanley.chu@mediatek.com
Reviewed-by: Avri Altman
Co-developed-by: Andy Teng
Signed-off-by: Andy Teng
Signed-off-by: Stanley Chu
Signed-off-by: Martin K. Petersen -
Fix to return error code PTR_ERR() from the error handling case instead of
0.Link: https://lore.kernel.org/r/20200806070135.67797-1-jingxiangfeng@huawei.com
Fixes: 22617e216331 ("scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes")
Reviewed-by: Avri Altman
Signed-off-by: Jing Xiangfeng
Signed-off-by: Martin K. Petersen
15 Aug, 2020
1 commit
-
Pull more SCSI updates from James Bottomley:
"This is the set of patches which arrived too late to stabilise in
-next for the first pull.It's really just an lpfc driver update and an assortment of minor
fixes, all in drivers. The only core update is to the zone block
device driver, which isn't the one most people use"* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Update lpfc version to 12.8.0.3
scsi: lpfc: Fix LUN loss after cable pull
scsi: lpfc: Fix validation of bsg reply lengths
scsi: lpfc: Fix retry of PRLI when status indicates its unsupported
scsi: lpfc: Fix oops when unloading driver while running mds diags
scsi: lpfc: Fix RSCN timeout due to incorrect gidft counter
scsi: lpfc: Fix no message shown for lpfc_hdw_queue out of range value
scsi: lpfc: Fix FCoE speed reporting
scsi: lpfc: Add missing misc_deregister() for lpfc_init()
scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetport
scsi: scsi_transport_sas: Add spaces around binary operator "|"
scsi: sd_zbc: Improve zone revalidation
scsi: libfc: Free skb in fc_disc_gpn_id_resp() for valid cases
scsi: fcoe: Memory leak fix in fcoe_sysfs_fcf_del()
scsi: target: Make iscsit_register_transport() return void