Eric Lee / smarc-fsl-linux-kernel

20 Jul, 2012

40 commits

9524c6821 [SCSI] libsas: add sas_eh_abort_handler

When recovering failed eh-cmnds let the lldd attempt an abort via
scsi_abort_eh_cmnd before escalating.

Reviewed-by: Jacek Danecki
Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:50 +0800
5db45bdc8 [SCSI] libsas: enforce eh strategy handlers only in eh context

The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions). However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler, otherwise arrange
for them to be triggered in eh_context.

Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:50 +0800
b9d5c6b7e [SCSI] cleanup setting task state in scsi_error_handler()

On a quick reading of scsi_error_handler() one could come away with the
impression that it does its wakeup event check while the task state is
TASK_RUNNING. In fact it sets TASK_INTERRUPTIBLE at the bottom of the
loop, but that is ~50 lines down.

Just set TASK_INTERRUPTIBLE at the top of the loop and be done.

Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:47 +0800
36fed4980 [SCSI] libsas: cleanup spurious calls to scsi_schedule_eh

eh is woken up automatically by the presence of failed commands;
scsi_schedule_eh is reserved for cases where there are no failed
commands. This guarantees that host_eh_scheduled is only incremented
when an explicit eh request is made.

Reviewed-by: Jacek Danecki
Signed-off-by: Maciej Trela
[fixed spurious delete of sas_ata_task_abort]
Signed-off-by: Artur Wojcik
Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Maciej Trela
2012-07-20 15:58:47 +0800
57fc2e335 [SCSI] fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)

Rapid ata hotplug on a libsas controller results in cases where libsas
is waiting indefinitely on eh to perform an ata probe.

A race exists between scsi_schedule_eh() and scsi_restart_operations()
in the case when scsi_restart_operations() issues i/o to other devices
in the sas domain. When this happens the host state transitions from
SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and
->host_busy is non-zero so we put the eh thread to sleep even though
->host_eh_scheduled is active.

Before putting the error handler to sleep we need to check if the
host_state needs to return to SHOST_RECOVERY for another trip through
eh. Since i/o that is released by scsi_restart_operations has been
blocked for at least one eh cycle, this implementation allows those
i/o's to run before another eh cycle starts to discourage hung task
timeouts.

Cc:
Reported-by: Tom Jackson
Tested-by: Tom Jackson
Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:46 +0800
e4a9c3732 [SCSI] libata, libsas: introduce sched_eh and end_eh port ops

When managing shost->host_eh_scheduled libata assumes that there is a
1:1 shost-to-ata_port relationship. libsas creates a 1:N relationship
so it needs to manage host_eh_scheduled cumulatively at the host level.
The sched_eh and end_eh port ops allow libsas to track when domain
devices enter/leave the "eh-pending" state under ha->lock (previously
named ha->state_lock, but it is no longer just a lock for ha->state
changes).

Since host_eh_scheduled indicates eh without backing commands pinning
the device, the device can be deallocated at any time. Move the taking of the
domain_device reference under the port_lock to guarantee that the
ata_port stays around for the duration of eh.

Reviewed-by: Jacek Danecki
Acked-by: Jeff Garzik
Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:45 +0800
3b661a92e [SCSI] fix hot unplug vs async scan race

The following crash results from cases where the end_device has been
removed before scsi_sysfs_add_sdev has had a chance to run.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
IP: [] sysfs_create_dir+0x32/0xb6
...
Call Trace:
[] kobject_add_internal+0x120/0x1e3
[] ? trace_hardirqs_on+0xd/0xf
[] kobject_add_varg+0x41/0x50
[] kobject_add+0x64/0x66
[] device_add+0x12d/0x63a
[] ? _raw_spin_unlock_irqrestore+0x47/0x56
[] ? module_refcount+0x89/0xa0
[] scsi_sysfs_add_sdev+0x4e/0x28a
[] do_scan_async+0x9c/0x145

...teach scsi_sysfs_add_devices() to check for deleted devices before
trying to add them, and teach scsi_remove_target() how to remove targets
that have not been added via device_add().

Cc:
Reported-by: Dariusz Majchrzak
Signed-off-by: Dan Williams
Signed-off-by: James Bottomley

Dan Williams
2012-07-20 15:58:45 +0800
b5f1758f2 [SCSI] aacraid: Fix endian issues in core and SRC portions of driver

This may not fix all endian issues in this driver, but it does get the
driver working on PowerPC for a PMC SRC card. So it should at least fix
all the problems in the core and in the SRC support.

[jejb: fix >> 32 breakage reported by Fengguang Wu]
Signed-off-by: Ben Collins
Acked-by: Achim Leubner
Signed-off-by: James Bottomley

Ben Collins
2012-07-20 15:58:44 +0800
30002f1c0 [SCSI] aacraid: Relax the tight timeout loop on fib commands

The loop that waited for synchronous fib commands was causing a CPU stall
when a timeout actually occurred.

1) Switch to using a more accurate timeout mechanism.
2) Do not pace the loop with udelay(). Use cpu_relax() to allow for
scheduling to occur.

Signed-off-by: Ben Collins
Acked-by: Achim Leubner
Signed-off-by: James Bottomley

Ben Collins
2012-07-20 15:58:44 +0800
361ee9c3f [SCSI] aacraid: Better handling of in-flight events on thread stop

When an error occurred that would shut down the driver, some in-flight
events were getting caught up, deadlocking a CPU or two.

Signed-off-by: Ben Collins
Acked-by: Achim Leubner
Signed-off-by: James Bottomley

Ben Collins
2012-07-20 15:58:43 +0800
ff08784b4 [SCSI] aacraid: Use resource_size_t for IO mem pointers and offsets

This also stops using the "legacy crap" in Scsi_Host (shost->base is an
unsigned long).

This affected 32-bit systems that have 64-bit resource sizes, causing the
IO address to be truncated.
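
The truncation described above can be illustrated with a minimal userspace sketch. The type names `toy_ulong32` and `toy_res_size` are hypothetical stand-ins (for a 32-bit unsigned long and for resource_size_t on a system with 64-bit resources); this is not the driver's code.

```c
#include <stdint.h>

/* Hypothetical stand-in for "unsigned long" on a 32-bit system. */
typedef uint32_t toy_ulong32;
/* Hypothetical stand-in for resource_size_t with 64-bit resources. */
typedef uint64_t toy_res_size;

/* Storing a 64-bit address in a 32-bit field silently drops the high bits. */
toy_ulong32 toy_store_legacy(toy_res_size addr)
{
    return (toy_ulong32)addr;
}

/* Keeping the wide type end to end preserves the full address. */
toy_res_size toy_store_wide(toy_res_size addr)
{
    return addr;
}
```

An address just above 4 GiB, such as 0x100001000, comes back as 0x1000 from the narrow path and unchanged from the wide one.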

Signed-off-by: Ben Collins
Acked-by: Achim Leubner
Signed-off-by: James Bottomley

Ben Collins
2012-07-20 15:58:43 +0800
7e8a74b17 [SCSI] scsi_dh: add scsi_dh_attached_handler_name

Introduce scsi_dh_attached_handler_name() to retrieve the name of the
scsi_dh that is attached to the scsi_device associated with the provided
request queue. Returns NULL if a scsi_dh is not attached.

Also, fix scsi_dh_{attach,detach} function header comments to document
@q rather than @sdev.

Signed-off-by: Mike Snitzer
Tested-by: Babu Moger
Reviewed-by: Chandra Seetharaman
Acked-by: Hannes Reinecke
Signed-off-by: James Bottomley

Mike Snitzer
2012-07-20 15:58:42 +0800
6aca4112f [SCSI] cxgb4i: tcp push bit fix

Fixed the parentheses so the tcp push bit would be sent properly.

Signed-off-by: Karen Xie
Reviewed-by: Mike Christie
Signed-off-by: James Bottomley

Karen Xie
2012-07-20 15:58:42 +0800
b485462ac [SCSI] Stop accepting SCSI requests before removing a device

Prevent the code for requeueing SCSI requests from triggering a
crash by making sure that it is no longer scheduled
after a device has been removed.

Also, source code inspection of __scsi_remove_device() revealed
a race condition in this function: no new SCSI requests must be
accepted for a SCSI device after device removal has started.

Signed-off-by: Bart Van Assche
Reviewed-by: Mike Christie
Acked-by: Tejun Heo
Signed-off-by: James Bottomley

Bart Van Assche
2012-07-20 15:58:41 +0800
84feb1664 [SCSI] Change return type of scsi_queue_insert() into void

The return value of scsi_queue_insert() is ignored by all its
callers, hence change the return type of this function into
void.

Signed-off-by: Bart Van Assche
Reviewed-by: Mike Christie
Reviewed-by: Tejun Heo
Signed-off-by: James Bottomley

Bart Van Assche
2012-07-20 15:58:41 +0800
940f5d47e [SCSI] Avoid dangling pointer in scsi_requeue_command()

When we call scsi_unprep_request() the command associated with the request
gets destroyed and therefore drops its reference on the device. If this was
the only reference, the device may get released and we end up with a NULL
pointer deref when we call blk_requeue_request.
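
The usual remedy for this pattern is to pin the device with an extra reference across the call that destroys the command. The toy model below sketches that idea in userspace C; every `toy_*` name is hypothetical, and a `released` flag stands in for actually freeing the device so the sketch stays safe to run. It is not the kernel code from this commit.

```c
#include <stddef.h>

/* Toy refcounted device; "released" marks the point where it would be freed. */
struct toy_dev { int refs; int released; };
/* Toy command that holds one reference on its device. */
struct toy_cmd { struct toy_dev *dev; };

void toy_get(struct toy_dev *d) { d->refs++; }

void toy_put(struct toy_dev *d)
{
    if (--d->refs == 0)
        d->released = 1;
}

/* Destroying the command drops the reference it held on the device. */
void toy_unprep(struct toy_cmd *c)
{
    toy_put(c->dev);
    c->dev = NULL;
}

/* Returns 1 if the device was still alive at the point of the requeue. */
int toy_requeue_pinned(struct toy_cmd *c)
{
    struct toy_dev *d = c->dev;
    int alive;

    toy_get(d);            /* pin the device before the command goes away */
    toy_unprep(c);
    alive = !d->released;  /* the requeue step would touch the device here */
    toy_put(d);            /* drop the pin; the device may be released now */
    return alive;
}
```

With a device whose only reference is the command's, the pinned variant still sees a live device during the requeue step, and the device is released only after the final put.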

Reported-by: Mike Christie
Signed-off-by: Bart Van Assche
Reviewed-by: Mike Christie
Reviewed-by: Tejun Heo
Cc:
[jejb: enhance comment and add commit log for stable]
Signed-off-by: James Bottomley

Bart Van Assche
2012-07-20 15:58:40 +0800
67bd94130 [SCSI] Fix device removal NULL pointer dereference

Use blk_queue_dead() to test whether the queue is dead instead
of !sdev. Since scsi_prep_fn() may be invoked concurrently with
__scsi_remove_device(), keep the queuedata (sdev) pointer in
__scsi_remove_device(). This patch fixes a kernel oops that
can be triggered by USB device removal. See also
http://www.spinics.net/lists/linux-scsi/msg56254.html.

Other changes included in this patch:
- Swap the blk_cleanup_queue() and kfree() calls in
scsi_host_dev_release() to make that code easier to grasp.
- Remove the queue dead check from scsi_run_queue() since the
queue state can change anyway at any point in that function
where the queue lock is not held.
- Remove the queue dead check from the start of scsi_request_fn()
since it is redundant with the scsi_device_online() check.

Reported-by: Jun'ichi Nomura
Signed-off-by: Bart Van Assche
Reviewed-by: Mike Christie
Reviewed-by: Tejun Heo
Cc:
Signed-off-by: James Bottomley

Bart Van Assche
2012-07-20 15:58:40 +0800
e81ca6fe8 [SCSI] block: Fix blk_execute_rq_nowait() dead queue handling

If the queue is dead blk_execute_rq_nowait() doesn't invoke the done()
callback function. That will result in blk_execute_rq() being stuck
in wait_for_completion(). Avoid this by initializing rq->end_io to the
done() callback before we check the queue state. Also, make sure the
queue lock is held around the invocation of the done() callback. Found
this through source code review.

Signed-off-by: Muthukumar Ratty
Signed-off-by: Bart Van Assche
Reviewed-by: Tejun Heo
Acked-by: Jens Axboe
Signed-off-by: James Bottomley

Muthukumar Ratty
2012-07-20 15:58:39 +0800
6548b0e5b [SCSI] megaraid: remove a spurious IRQ enable

We took this lock with spin_lock() so we should unlock it with
spin_unlock() instead of spin_unlock_irq(). This was introduced in
f2c8dc402b "[SCSI] megaraid_mbox: remove scsi_assign_lock usage".

Signed-off-by: Dan Carpenter
Acked-by: Adam Radford
Signed-off-by: James Bottomley

Dan Carpenter
2012-07-20 15:58:39 +0800
9d5d93e32 [SCSI] megaraid: cleanup type issue in mega_build_cmd()

On 64 bit systems the current code sets 32 bits of "seg" and leaves the
other 32 uninitialized. It doesn't matter since the variable is never
used. But it's still messy and we should fix it.

Signed-off-by: Dan Carpenter
Acked-by: Adam Radford
Signed-off-by: James Bottomley

Dan Carpenter
2012-07-20 15:58:38 +0800
a5254dbb1 [SCSI] bfa: dereferencing freed memory in bfad_im_probe()

If bfad_thread_workq(bfad) was not BFA_STATUS_OK then we freed "im"
and then dereferenced it.

I did a little clean up because it seemed nicer to return directly
instead of doing a superfluous goto. I looked at other functions in
this file and it seems like returning directly is standard.

Signed-off-by: Dan Carpenter
Acked-by: Krishna Gudipati
Signed-off-by: James Bottomley

Dan Carpenter
2012-07-20 15:58:37 +0800
fffa69230 [SCSI] bfa: off by one in bfa_ioc_mbox_isr()

If mc == BFI_MC_MAX then we're reading past the end of the
mod->mbhdlr[] array.
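
The general shape of such a bounds check can be sketched as follows. `TOY_MC_MAX`, `toy_table` and `toy_dispatch` are hypothetical names chosen for illustration; the sketch only assumes a handler array with `TOY_MC_MAX` entries and does not reproduce the driver's code.

```c
#include <stddef.h>

enum { TOY_MC_MAX = 4 };  /* hypothetical stand-in for BFI_MC_MAX */

typedef int (*toy_handler)(int);

static int toy_echo(int v) { return v; }

/* The table has TOY_MC_MAX entries, so valid indices are 0 .. TOY_MC_MAX - 1. */
static toy_handler toy_table[TOY_MC_MAX] = {
    toy_echo, toy_echo, toy_echo, toy_echo
};

/* Reject mc == TOY_MC_MAX as well: the test must be ">=", not ">". */
int toy_dispatch(int mc, int arg)
{
    if (mc < 0 || mc >= TOY_MC_MAX)
        return -1;
    return toy_table[mc](arg);
}
```

A check written as `mc > TOY_MC_MAX` would let the index equal to the array size through, which is exactly the one-past-the-end read the commit describes.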

Signed-off-by: Dan Carpenter
Acked-by: Krishna Gudipati
Signed-off-by: James Bottomley

Dan Carpenter
2012-07-20 15:58:37 +0800
9e1a15376 [SCSI] properly initialize atomic_t

Initialize atomic_t scsi_host_next_hn and ioerr_cnt as per the guidelines
defined in Documentation/atomic_ops.txt.

Signed-off-by: Josh Hunt
Signed-off-by: James Bottomley

Josh Hunt
2012-07-20 15:58:36 +0800
bb2c94a3a [SCSI] scsi_dh_alua: Re-enable STPG for unavailable ports

A quote from SPC-4: "While in the unavailable primary target port
asymmetric access state, the device server shall support those of
the following commands that it supports while in the active/optimized
state: [ ... ] d) SET TARGET PORT GROUPS; [ ... ]". Hence re-enable
sending STPG to a target port group that is in the unavailable state.

Signed-off-by: Bart Van Assche
Reviewed-by: Babu Moger
Signed-off-by: James Bottomley

Bart Van Assche
2012-07-20 15:58:36 +0800
efb6c717b [SCSI] qla4xxx: Update driver version to 5.02.00-k18

Signed-off-by: Vikas Chaudhary
Signed-off-by: James Bottomley

Vikas Chaudhary
2012-07-20 15:58:35 +0800
18e2df938 [SCSI] qla4xxx: Fix Spell check.

Signed-off-by: Vikas Chaudhary
Signed-off-by: James Bottomley

Vikas Chaudhary
2012-07-20 15:58:35 +0800
68b6d5d3d [SCSI] qla4xxx: Fix a Sparse warning message

Fix the following message:
drivers/scsi/qla4xxx/ql4_os.c:3266:5: error: symbol 'qla4xxx_post_aen_work' redeclared with different type (originally declared at drivers/scsi/qla4xxx/ql4_glbl.h:186) - incompatible argument 2 (different signedness)

Signed-off-by: Vikas Chaudhary
Signed-off-by: James Bottomley

Vikas Chaudhary
2012-07-20 15:58:34 +0800
1cb78d73d [SCSI] qla4xxx: multi-session fix for flash ddbs

Allow multi-session to a target (for flash ddbs) accessible via
multiple network portals.

Signed-off-by: Vikas Chaudhary
Signed-off-by: James Bottomley

Vikas Chaudhary
2012-07-20 15:58:34 +0800
bc97f4bb4 [SCSI] scsi_dh_alua: backoff alua rtpg retry linearly vs. geometrically

Currently the backoff algorithm for when to retry alua rtpg
requests progresses geometrically as so:

2, 4, 8, 16, 32, 64... seconds.

This progression can lead to unneeded delay in retrying
alua rtpg requests when the rtpgs are delayed. A less
aggressive backoff algorithm that is additive would not
lead to such large jumps when delays start getting long, but
would back off linearly:

2, 4, 6, 8, 10... seconds.
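
The two progressions can be written as small functions of a zero-based retry count. The function names are hypothetical and this is only a sketch of the arithmetic, assuming a 2-second base interval as in the sequences above; it is not the handler's code.

```c
/* Geometric backoff: the delay doubles on each retry (2, 4, 8, 16, 32, 64).
 * attempt is zero-based and is assumed to be small enough not to overflow. */
unsigned toy_geometric_delay(unsigned attempt)
{
    return 2u << attempt;
}

/* Linear backoff: the delay grows by a fixed 2 seconds (2, 4, 6, 8, 10). */
unsigned toy_linear_delay(unsigned attempt)
{
    return 2u * (attempt + 1u);
}
```

By the sixth retry (attempt 5) the geometric schedule waits 64 seconds, while the linear one waits 12.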

Signed-off-by: Martin George
Signed-off-by: Rob Evers
Reviewed-by: Babu Moger
Signed-off-by: James Bottomley

Rob Evers
2012-07-20 15:58:33 +0800
8e67ce607 [SCSI] scsi_dh_alua: retry alua rtpg extended header for illegal request response

Some storage arrays are known to return 'illegal request'
when an rtpg extended header request is made. T10 says the
array should ignore the bit, and return the non-extended
rtpg as the array doesn't support the request. Work
around this by retrying the rtpg request without the extended
header bit set when the extended rtpg request results in
illegal request.
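
The retry logic described above can be sketched as a small fallback wrapper. All `toy_*` names, the status enum, and the callback signature are assumptions made for illustration; the real handler issues SCSI commands rather than calling a function pointer.

```c
enum toy_status { TOY_OK = 0, TOY_ILLEGAL_REQUEST = 1 };

/* Hypothetical callback that sends an rtpg, with or without the extended header bit. */
typedef enum toy_status (*toy_rtpg_fn)(int ext_hdr);

/* Try the extended header first; on "illegal request", retry once without it. */
enum toy_status toy_rtpg_with_fallback(toy_rtpg_fn send, int *used_ext_hdr)
{
    enum toy_status st = send(1);

    *used_ext_hdr = 1;
    if (st == TOY_ILLEGAL_REQUEST) {
        st = send(0);
        *used_ext_hdr = 0;
    }
    return st;
}

/* Model of an array that rejects the extended header request. */
enum toy_status toy_strict_array(int ext_hdr)
{
    return ext_hdr ? TOY_ILLEGAL_REQUEST : TOY_OK;
}

/* Model of an array that accepts either form. */
enum toy_status toy_lenient_array(int ext_hdr)
{
    (void)ext_hdr;
    return TOY_OK;
}
```

Against the strict model the wrapper succeeds on the second, non-extended attempt; against the lenient model it never falls back.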

Signed-off-by: Rob Evers
Reviewed-by: Babu Moger
Signed-off-by: James Bottomley

Rob Evers
2012-07-20 15:58:33 +0800
3588c5a21 [SCSI] scsi_dh_alua: implement 'implied transition timeout'

During alua transitions, an array can return transitioning
status in response to rtpg requests. These requests get
retried for a maximum of 60 seconds by default before timing
out. Sometimes this timeout isn't sufficient to allow the
array to complete the transition. T10-spc4 addresses this
under 'Report Target Port Groups' command.

This update retrieves the timeout value from the storage
array if available and retries the transitioning rtpgs
for up to the 'implied transitioning timeout' value.
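
The selection between the default and the array-supplied value can be sketched in one function. The name `toy_transition_timeout` and the convention that 0 means "not reported" are assumptions for this sketch; only the 60-second default comes from the text above.

```c
enum { TOY_DEFAULT_TIMEOUT_S = 60 };  /* default retry window from the text above */

/* Use the timeout reported by the array when present (non-zero in this sketch),
 * otherwise fall back to the default. */
unsigned toy_transition_timeout(unsigned reported_s)
{
    return reported_s != 0 ? reported_s : TOY_DEFAULT_TIMEOUT_S;
}
```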

Signed-off-by: Rob Evers
Reviewed-by: Babu Moger
Signed-off-by: James Bottomley

Rob Evers
2012-07-20 15:58:32 +0800
6ad819b06 [SCSI] arcmsr: fix misuse of | instead of &

ARCMSR_ARC1880_DiagWrite_ENABLE is 0x00000080 so (x | 0x00000080) is
never zero. The intent here was to loop until
ARCMSR_ARC1880_DiagWrite_ENABLE was turned on, but because the test was
wrong, we would do five loops regardless of whether it succeeded or not.

Also I simplified the condition a little by removing the unused
assignment.
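
The difference between the two operators is easy to see in isolation. In this sketch `TOY_DIAG_WRITE_ENABLE` is a hypothetical macro carrying the 0x00000080 value quoted above, and the two predicates are illustrative rather than the driver's actual condition.

```c
#include <stdint.h>

#define TOY_DIAG_WRITE_ENABLE 0x00000080u  /* value quoted in the commit message */

/* Buggy test: OR-ing in a non-zero constant can never yield zero,
 * so this reports "not clear" for every register value. */
int toy_bit_clear_buggy(uint32_t reg)
{
    return (reg | TOY_DIAG_WRITE_ENABLE) == 0;
}

/* Intended test: AND isolates the bit, so the result depends on the register. */
int toy_bit_clear_fixed(uint32_t reg)
{
    return (reg & TOY_DIAG_WRITE_ENABLE) == 0;
}
```

With the OR form a retry loop keyed on this test behaves the same whether or not the bit is ever set, which matches the "five loops regardless" behaviour described above.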

Signed-off-by: Dan Carpenter
Acked-by: Nick Cheng
Signed-off-by: James Bottomley

Dan Carpenter
2012-07-20 15:58:31 +0800
23f0bb47a [SCSI] hptiop: fix RR312x in hosts with >12GB

Due to a limitation of the RR312x's DMA engine, the HBA cannot access host memory
above 12GB. This fixes

https://bugzilla.kernel.org/show_bug.cgi?id=14311

[alan: resurrected bug from 2009 and pushed upstream]
Reported-by: Alan Cox
Signed-off-by: HighPoint Linux Team
Signed-off-by: James Bottomley

HighPoint Linux Team
2012-07-20 15:58:30 +0800
f3d8af9e2 [SCSI] lpfc 8.3.32: Update lpfc to version 8.3.32

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:30 +0800
4b8bae08b [SCSI] lpfc 8.3.32: Fix error reporting of misconfigured ports

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:30 +0800
6b415f5d6 [SCSI] lpfc 8.3.32: Fix system panic due to node state change

Fix System Panic During IO Test using Medusa tool

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:29 +0800
173edbb2c [SCSI] lpfc 8.3.32: Fix ability to change FCP EQ delay multiplier

Fix fcp_imax module parameter to dynamically change FCP EQ delay multiplier

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:29 +0800
3a70730aa [SCSI] lpfc 8.3.32: Correct successful aborts returning error status

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:28 +0800
618a5230b [SCSI] lpfc 8.3.32: Correct provisioning change failure on local function

Fixed system hang when performing resource provisioning through the same PCI
function

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:28 +0800
bbeb79b90 [SCSI] lpfc 8.3.32: Correct host DIF configuration that hung system

Fix system hang due to bad protection module parameters (CR: 130769)

Signed-off-by: Alex Iannicelli
Signed-off-by: James Smart
Signed-off-by: James Bottomley

James Smart
2012-07-20 15:58:27 +0800