10 May, 2016
1 commit
-
…kp/scsi into for-4.7-zac
Pulling in the dependencies for further ZAC changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
06 May, 2016
2 commits
-
I actually read the error messages in my logs, and successful
initialization is not an error.
Arguably these log lines could be deleted entirely.
Signed-off-by: Andy Lutomirski
Reviewed-by: Hannes Reinicke
Acked-by: Sumit Saxena
Signed-off-by: Martin K. Petersen -
When a cxlflash adapter goes into EEH recovery and multiple processes
(each having established its own context) are active, the EEH recovery
can hang if the processes attempt to recover in parallel. The symptom
logged after a couple of minutes is:
INFO: task eehd:48 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
eehd 0 48 2
Call Trace:
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_write_failed+0x294/0x410
down_write+0x88/0xb0
cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
cxl_vphb_error_detected+0x88/0x110 [cxl]
cxl_pci_error_detected+0xb0/0x1d0 [cxl]
eeh_report_error+0xbc/0x130
eeh_pe_dev_traverse+0x94/0x160
eeh_handle_normal_event+0x17c/0x450
eeh_handle_event+0x184/0x370
eeh_event_handler+0x1c8/0x1d0
kthread+0x110/0x130
ret_from_kernel_thread+0x5c/0xa4
INFO: task blockio:33215 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blockio 0 33215 33213
Call Trace:
0x1 (unreliable)
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_read_failed+0x124/0x1d0
down_read+0x68/0x80
cxlflash_ioctl+0x70/0x6f0 [cxlflash]
scsi_ioctl+0x3b0/0x4c0
sg_ioctl+0x960/0x1010
do_vfs_ioctl+0xd8/0x8c0
SyS_ioctl+0xd4/0xf0
system_call+0x38/0xb4
INFO: task eehd:48 blocked for more than 120 seconds.
The hang is caused by a three-way deadlock:
Process A holds the recovery mutex, and waits for eehd to complete.
Process B holds the semaphore and waits for the recovery mutex.
eehd waits for the semaphore.
The fix is to have Process B above release the semaphore before
attempting to acquire the recovery mutex. This will allow
eehd to proceed to completion.
Signed-off-by: Manoj N. Kumar
Reviewed-by: Matthew R. Ochs
Signed-off-by: Martin K. Petersen
04 May, 2016
2 commits
-
Based on "[PATCH V2] scsi_debug: rework resp_report_luns" patch
sent by Tomas Winkler on Thursday, 26 Feb 2015. His notes:
1. Remove duplicated boundary checks which simplify the fill-in
loop
2. Use more of scsi generic API
Replace the fixed length response array with a heap allocation
allowing up to 256 normal LUNs per target.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Reviewed-by: Tomas Winkler
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen -
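As a rough illustration of the resp_report_luns rework above, the sketch below builds a REPORT LUNS parameter list in a heap buffer sized from the LUN count instead of a fixed array. `build_report_luns()` and the `RL_*` constants are hypothetical names, and the sketch assumes that "normal LUNs" are encoded with single-level peripheral addressing (LUN number in byte 1 of each 8-byte entry); it is not the scsi_debug implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define RL_HDR_LEN   8   /* 4-byte LUN list length + 4 reserved bytes */
#define RL_ENTRY_LEN 8   /* each LUN is reported as an 8-byte field */

/*
 * Build a REPORT LUNS parameter list for LUNs 0..num_luns-1.
 * Returns a zeroed heap buffer the caller must free(), or NULL on failure;
 * *out_len receives the total response length in bytes.
 */
uint8_t *build_report_luns(unsigned int num_luns, size_t *out_len)
{
    size_t len;
    uint32_t list_len;
    uint8_t *buf;
    unsigned int i;

    assert(out_len != NULL);
    if (num_luns > 256)              /* assumed limit: LUN must fit in one byte */
        return NULL;

    len = RL_HDR_LEN + (size_t)num_luns * RL_ENTRY_LEN;
    buf = calloc(1, len);
    if (buf == NULL)
        return NULL;

    /* Header: big-endian length of the LUN list that follows. */
    list_len = num_luns * RL_ENTRY_LEN;
    buf[0] = (uint8_t)(list_len >> 24);
    buf[1] = (uint8_t)(list_len >> 16);
    buf[2] = (uint8_t)(list_len >> 8);
    buf[3] = (uint8_t)list_len;

    /* Entries: byte 0 stays 0 (peripheral addressing), byte 1 is the LUN. */
    for (i = 0; i < num_luns; i++)
        buf[RL_HDR_LEN + i * RL_ENTRY_LEN + 1] = (uint8_t)i;

    *out_len = len;
    return buf;
}
```

Sizing the buffer from the LUN count keeps the fill-in loop free of per-entry boundary checks, which is the simplification the notes describe.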
Use TYPE_* constants for SCSI peripheral device types instead of
numbers. Further cleanups requested by checkpatch.pl.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
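To show what the TYPE_* cleanup above buys in readability, here is a small sketch that decodes the peripheral device type from INQUIRY data and compares it against named constants rather than bare numbers. The constant names follow the kernel's TYPE_* convention and the values are the standard SCSI peripheral device type codes; `inquiry_ptype()` and `is_tape()` are hypothetical helpers, not scsi_debug functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI peripheral device type codes (subset). */
#define TYPE_DISK 0x00
#define TYPE_TAPE 0x01
#define TYPE_ROM  0x05

/* The device type is the low 5 bits of byte 0 of standard INQUIRY data. */
unsigned int inquiry_ptype(const uint8_t *inq)
{
    assert(inq != NULL);
    return inq[0] & 0x1f;
}

/* Reads better than "inquiry_ptype(inq) == 1". */
int is_tape(const uint8_t *inq)
{
    return inquiry_ptype(inq) == TYPE_TAPE;
}
```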
30 Apr, 2016
25 commits
-
The most common commands in normal use are the READ and WRITE SCSI
commands. Use likely and unlikely hints along the path taken by these
commands. Rename check_readiness() to make_ua() and remove associated
dead code. Rename devInfoReg() to find_build_dev_info().
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
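The likely/unlikely hints mentioned in the READ/WRITE commit above can be sketched in userspace as follows. The macros are defined here on top of the GCC/Clang builtin `__builtin_expect`, which is an assumption about the compiler; `classify_cdb()` and `enum path` are hypothetical and only illustrate marking the READ/WRITE branch as the expected one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace equivalents of the kernel's branch hints (GCC/Clang only). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

#define READ_10  0x28   /* SCSI READ(10) opcode */
#define WRITE_10 0x2a   /* SCSI WRITE(10) opcode */

enum path { PATH_FAST_RW, PATH_SLOW };

/* Route a command by opcode, telling the compiler READ/WRITE is the hot case. */
enum path classify_cdb(const uint8_t *cdb)
{
    assert(cdb != NULL);
    if (likely(cdb[0] == READ_10 || cdb[0] == WRITE_10))
        return PATH_FAST_RW;
    return PATH_SLOW;
}
```

The hints do not change behavior; they only influence how the compiler lays out the branches.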
Group most defines together first; followed by struct definitions and
then table and variable definitions. Normalize all function headers.
[mkp: Corrected hex value in WP/DPOFUA MODE SENSE comment]
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
When a negative value was placed in the delay parameter, a tasklet was
scheduled. Change the tasklet to a work queue. Previously a delay of -1
scheduled a high priority tasklet; since there are no high priority work
queues, treat -1 like other negative values in delay and schedule a work
item.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Add 'j' to delay names to make it clearer that its unit is jiffies and
to differentiate it from sdebug_ndelay whose unit is nanoseconds.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
The driver supports two command delay interfaces, the original one whose
unit is a jiffy, and a newer one whose unit is a nanosecond. Each had
different implementations. Keep both interfaces but simplify the
implementation to use a single delay mechanism based on high resolution
timers.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Remove logic to optionally hold host_lock while each command is
queued. Keep module and sysfs host_lock parameters for backward
compatibility. Note in module parameter description that host_lock is
ignored.
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Shorten file scope static and constant names. Use more
get/put_unaligned calls to hide bit banging. Introduce
sdebug_verbose boolean to replace frequent masking of
option bit flags. Add GPL and bump version.
[mkp: Use logical instead of bitwise OR for LBP VPD flags]
Signed-off-by: Douglas Gilbert
Reviewed-by: Hannes Reinecke
Signed-off-by: Martin K. Petersen -
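The "get/put_unaligned calls to hide bit banging" in the commit above refer to helpers that read or write multi-byte fields at arbitrary byte offsets. A minimal userspace sketch of the idea is shown below; `put_be32()` and `get_be32()` are stand-ins written for this example and are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value big-endian at p, which may be unaligned. */
static inline void put_be32(uint32_t val, uint8_t *p)
{
    assert(p != NULL);
    p[0] = (uint8_t)(val >> 24);
    p[1] = (uint8_t)(val >> 16);
    p[2] = (uint8_t)(val >> 8);
    p[3] = (uint8_t)val;
}

/* Load a big-endian 32-bit value from p, which may be unaligned. */
static inline uint32_t get_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Call sites then read as a single `put_be32(lba, cdb + 2)` instead of four shift-and-mask assignments.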
Reviewed-by: Gerry Morong
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Need to report HBA device removal faster than the
event handler polling interval.
Stop I/O to the removed disk and wait for all
I/O operations to flush before removing the device.
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Set offload_to_be_enabled to 0 when an ioaccel2 error is processed.
Before, an ioaccel completion error would turn off ioaccel but a rescan
would turn it back on again.
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
offload_to_be_enabled also needs to be set to 0 during a state
change.
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Faulty drives can cause the driver to hang during a
scan operation.
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
There have been companies requesting a sysfs entry
to obtain the SAS address of a device.
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
The driver was calling scsi_scan_host before enabling interrupts.
This has gone unnoticed except for customers running in intx mode.
Calling scsi_scan_host before interrupts are enabled causes
"irq XX: nobody cared" messages and the driver to hang.
This patch enables interrupts before the call to scsi_scan_host.
Reported-by: Piotr Karbowski
Reviewed-by: Scott Teel
Reviewed-by: Kevin Barnett
Signed-off-by: Don Brace
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
When KDUMP is triggered the driver first talks to the firmware in INTX
mode, but the adapter firmware is still in MSIX mode. Therefore the first
driver command hangs since the driver is waiting for an INTX response and
firmware gives an MSIX response. When the OS is installed on a RAID
drive created by the adapter, KDUMP will hang since the driver does not
receive a response in sync mode.
Fixed by: Change the firmware to INTX mode if it is in MSIX mode before
sending the first sync command.
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Currently the driver completes double-completed or spuriously interrupted fibs.
This is not necessary and causes the SCSI mid layer to issue aborts and
resets, since completing a fib prematurely might trigger a race condition
resulting in the driver not calling the scsi_done callback.
Fixed by removing the call to fib complete.
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Firmware AIF messages about cache loss and data recovery are being missed
by the driver since currently they are not captured but rather let go.
This patch captures those messages and logs them for the user.
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Typically under error conditions, it is possible for aac_command_thread()
to miss the wakeup from kthread_stop() and go back to sleep, causing it
to hang aac_shutdown.
In the observed scenario, the adapter is not functioning correctly and so
aac_fib_send() never completes (or times out, depending on how it was
called). Shortly after aac_command_thread() starts it performs
aac_fib_send(SendHostTime) which hangs. When the aac_probe_one
/aac_get_adapter_info send times out, kthread_stop is called which breaks
the command thread out of its hang.
The code will still go back to sleep in schedule_timeout() without
checking kthread_should_stop() so it causes aac_probe_one to hang until
the schedule_timeout() expires, which is 30 minutes.
Fixed by: Adding another kthread_should_stop() before schedule_timeout()
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
As the firmware for series 6, 7, 8 cards does not support msi, remove it
in the driver.
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
aac_fib_send has a special function case for initial commands during
driver initialization using wait < 0 (pseudo sync mode). In this case,
the command does not sleep but rather spins checking for timeout. This
loop calls cpu_relax() in an attempt to allow other processes/threads
to use the CPU, but this function does not relinquish the CPU and so the
command will hog the processor. This was observed in a KDUMP
"crashkernel" and that prevented the "command thread" (which is
responsible for completing the command before it times out) from
starting because it could not get the CPU.
Fixed by replacing "cpu_relax()" call with "schedule()"
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
The adapter has to be started after updating the number of MSIX Vectors
Fixes: ecc479e00db8 (aacraid: Set correct MSIX count for EEH recovery)
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Suggested-by: Seymour, Shane M
Signed-off-by: Raghava Aditya Renukunta
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
The current driver checks for a NULL return from aac_fib_alloc_tag, but it is
not possible for it to return NULL.
Fixed by: Remove all the checks for NULL returns from aac_fib_alloc_tag
Suggested-by: Tomas Henzl
Signed-off-by: Raghava Aditya Renukunta
Signed-off-by: Martin K. Petersen -
mptsas_smp_handler() checks for dma mapping errors by comparing the
returned address with zero, while pci_dma_mapping_error() should be
used.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov
Acked-by: Sathya Prakash Veerichetty
Signed-off-by: Martin K. Petersen
26 Apr, 2016
6 commits
-
The file atari_NCR5380.c has been removed from the tree so remove it
from the MAINTAINERS file as well.
While we are here, add the file dtc3x80.txt as it is only relevant to
the dtc driver.
Signed-off-by: Finn Thain
Signed-off-by: Martin K. Petersen -
Some drives set the ILI flag together with MEDIUM ERROR sense code.
Clear the ILI flag in this case so that the medium error will be
handled. The problem was reported by Maurizio Lombardi.
Signed-off-by: Kai Mäkisara
Reviewed-by: Laurence Oberman
Signed-off-by: Martin K. Petersen -
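The ILI handling described in the tape driver commit above can be sketched against byte 2 of fixed-format sense data, where bit 5 is ILI and the low nibble is the sense key. `normalize_sense_flags()` is a hypothetical helper written for this example; it only shows the "clear ILI when the key is MEDIUM ERROR" rule and is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define SENSE_ILI      0x20  /* incorrect length indicator, bit 5 of byte 2 */
#define SENSE_KEY_MASK 0x0f  /* sense key, low nibble of byte 2 */
#define MEDIUM_ERROR   0x03  /* sense key value for MEDIUM ERROR */

/*
 * Take byte 2 of fixed-format sense data and return it with ILI cleared
 * when the sense key is MEDIUM ERROR, so the medium error is not masked
 * by length-mismatch handling. Other combinations are returned unchanged.
 */
uint8_t normalize_sense_flags(uint8_t byte2)
{
    if ((byte2 & SENSE_ILI) && (byte2 & SENSE_KEY_MASK) == MEDIUM_ERROR)
        byte2 = (uint8_t)(byte2 & ~SENSE_ILI);
    return byte2;
}
```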
Remove incorrect lockdep assertion from lpfc_sli_hbqbuf_find() which
acquires the hbalock itself. Fix the comment which resulted in this
mistake.
Fixes: 1c2ba475eb0e ("lpfc: Add lockdep assertions")
Signed-off-by: Sebastian Herbszt
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen -
Presumably it isn't possible to have empty lists here, but my static
checker doesn't know that and complains that "ep" can be used
uninitialized.
Signed-off-by: Dan Carpenter
Acked-by: Nilesh Javali
Signed-off-by: Martin K. Petersen -
This is only called from show_sas_rphy_enclosure_identifier(). The
caller expects that we set an identifier, otherwise it uses an
uninitialized variable.
[mkp: fixed typo]
Signed-off-by: Dan Carpenter
Acked-by: Don Brace
Signed-off-by: Martin K. Petersen -
Firmware events are queued up using the fw_event_work's struct work, not
its delayed_work member. The initial driver for SAS2 controllers had
handled firmware reset using the rescan barrier and was later redesigned
through "mpt2sas: [Resend] Host Reset code cleanup". The delayed_work
variables are now unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS
"assert_init not available" false warnings in
_scsih_fw_event_cleanup_queue.
Clean up fw_event_work's unused entries, update its kerneldoc, and
update _scsih_fw_event_cleanup_queue accordingly.
Fixes: 146b16c8071f (mpt3sas: Refcount fw_events and fix unsafe list usage)
Signed-off-by: Joe Lawrence
Acked-by: Chaitra P B
Signed-off-by: Martin K. Petersen
16 Apr, 2016
4 commits
-
Add custom version of function to allocate device,
alloc_dev_quirk_v2_hw(). For SATA devices the device id bit0 should be
0.
Signed-off-by: John Garry
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Add v2 hw custom function slot_index_alloc_quirk_v2_hw(). SAS devices
should have IPTT bit0 equal to 1.
Signed-off-by: John Garry
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Add methods to use HW specific versions of functions to allocate slot
and device. HW specific methods are permitted to work around the device id
vs IPTT collision issue in v2 hw.
Signed-off-by: John Garry
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen -
Signed-off-by: Sumit Saxena
Reviewed-by: Hannes Reinicke
Signed-off-by: Martin K. Petersen