Commit f2818663c82b7297ff4aa38cbddb870dc02f7104

Authored by Hannes Reinecke
Committed by James Bottomley
1 parent e47c11c7a4

[SCSI] scsi_transport_fc: Remove capping from dev_loss_tmo

Currently dev_loss_tmo is capped by SCSI_DEVICE_BLOCK_MAX_TIMEOUT.
This causes problems with multipathing when the 'no_path_retry' setting
exceeds the dev_loss_tmo setting, as then the system might run into
a deadlock when all paths have been removed temporarily for longer
than dev_loss_tmo.
The principal reason for the capping has been that we should
not allow a remote port to remain in status 'blocked' indefinitely,
so the capping is there to ensure that the port status is being reset
eventually.
However, the fast_io_fail_tmo will also move the remote port out of
the 'blocked' state, so for any HBA driver implementing both, the
capping should really be on the fast_io_fail_tmo, and not on the
dev_loss_tmo.
This patch implements just that, i.e. the fast_io_fail_tmo is capped
to SCSI_DEVICE_BLOCK_MAX_TIMEOUT and the capping is removed from
dev_loss_tmo when fast_io_fail_tmo is set.
This allows us to synchronize the dev_loss_tmo setting to the
'no_path_retry' setting from multipathing, thus avoiding the deadlock.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Acked-by: James Smart  <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
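
The validation rules described in the commit message can be modelled in a few lines of userspace C. This is only a sketch, not the kernel code: `struct rport_model`, `set_dev_loss_tmo()` and `set_fast_io_fail_tmo()` are hypothetical names, and the value 600 for `BLOCK_MAX_TIMEOUT` is an assumed stand-in for SCSI_DEVICE_BLOCK_MAX_TIMEOUT.

```c
#include <errno.h>

/* Assumed stand-in for SCSI_DEVICE_BLOCK_MAX_TIMEOUT (seconds). */
#define BLOCK_MAX_TIMEOUT 600UL

/* Hypothetical, simplified model of the two remote port timeouts. */
struct rport_model {
    unsigned long dev_loss_tmo;   /* seconds */
    long fast_io_fail_tmo;        /* seconds, -1 means "off" */
};

/*
 * dev_loss_tmo is only capped while fast_io_fail_tmo is off, because
 * then nothing else moves the port out of the 'blocked' state.
 */
static int set_dev_loss_tmo(struct rport_model *r, unsigned long val)
{
    if (r->fast_io_fail_tmo == -1 && val > BLOCK_MAX_TIMEOUT)
        return -EINVAL;
    r->dev_loss_tmo = val;
    return 0;
}

/*
 * fast_io_fail_tmo must stay below dev_loss_tmo and must not exceed
 * the block timeout cap.
 */
static int set_fast_io_fail_tmo(struct rport_model *r, unsigned long val)
{
    if (val >= r->dev_loss_tmo || val > BLOCK_MAX_TIMEOUT)
        return -EINVAL;
    r->fast_io_fail_tmo = (long)val;
    return 0;
}
```

In this model a large dev_loss_tmo is rejected until fast_io_fail_tmo has been set, so a caller that wants to match a long 'no_path_retry' period would set fast_io_fail_tmo first.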

Showing 1 changed file with 21 additions and 5 deletions

drivers/scsi/scsi_transport_fc.c
@@ -475,7 +475,8 @@
 		 "Maximum number of seconds that the FC transport should"
 		 " insulate the loss of a remote port. Once this value is"
 		 " exceeded, the scsi target is removed. Value should be"
-		 " between 1 and SCSI_DEVICE_BLOCK_MAX_TIMEOUT.");
+		 " between 1 and SCSI_DEVICE_BLOCK_MAX_TIMEOUT if"
+		 " fast_io_fail_tmo is not set.");
 
 /*
  * Netlink Infrastructure
@@ -842,9 +843,17 @@
 	    (rport->port_state == FC_PORTSTATE_NOTPRESENT))
 		return -EBUSY;
 	val = simple_strtoul(buf, &cp, 0);
-	if ((*cp && (*cp != '\n')) ||
-	    (val < 0) || (val > SCSI_DEVICE_BLOCK_MAX_TIMEOUT))
+	if ((*cp && (*cp != '\n')) || (val < 0))
 		return -EINVAL;
+
+	/*
+	 * If fast_io_fail is off we have to cap
+	 * dev_loss_tmo at SCSI_DEVICE_BLOCK_MAX_TIMEOUT
+	 */
+	if (rport->fast_io_fail_tmo == -1 &&
+	    val > SCSI_DEVICE_BLOCK_MAX_TIMEOUT)
+		return -EINVAL;
+
 	i->f->set_rport_dev_loss_tmo(rport, val);
 	return count;
 }
@@ -925,9 +934,16 @@
 		rport->fast_io_fail_tmo = -1;
 	else {
 		val = simple_strtoul(buf, &cp, 0);
-		if ((*cp && (*cp != '\n')) ||
-		    (val < 0) || (val >= rport->dev_loss_tmo))
+		if ((*cp && (*cp != '\n')) || (val < 0))
 			return -EINVAL;
+		/*
+		 * Cap fast_io_fail by dev_loss_tmo or
+		 * SCSI_DEVICE_BLOCK_MAX_TIMEOUT.
+		 */
+		if ((val >= rport->dev_loss_tmo) ||
+		    (val > SCSI_DEVICE_BLOCK_MAX_TIMEOUT))
+			return -EINVAL;
+
 		rport->fast_io_fail_tmo = val;
 	}
 	return count;
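
Because the uncapped dev_loss_tmo is only accepted once fast_io_fail_tmo is set, a userspace tool would plausibly write the two attributes in that order. The following is a minimal sketch under stated assumptions: `write_attr()` and `set_rport_timeouts()` are hypothetical helpers, and the remote port attribute directory (typically somewhere under /sys/class/fc_remote_ports/, though the exact path is an assumption here) is passed in by the caller.

```c
#include <stdio.h>

/* Write "<val>\n" to <dir>/<attr>. Returns 0 on success, -1 on failure. */
static int write_attr(const char *dir, const char *attr, unsigned long val)
{
    char path[512];
    FILE *f;
    int n = snprintf(path, sizeof(path), "%s/%s", dir, attr);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fprintf(f, "%lu\n", val) < 0) {
        fclose(f);
        return -1;
    }
    /* A rejected value would surface here, when the buffer is flushed. */
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Set fast_io_fail_tmo first, then dev_loss_tmo. The first write must
 * still be below the dev_loss_tmo currently in effect; the second write
 * may then exceed the block timeout cap.
 */
static int set_rport_timeouts(const char *rport_dir,
                              unsigned long fast_io_fail,
                              unsigned long dev_loss)
{
    if (write_attr(rport_dir, "fast_io_fail_tmo", fast_io_fail) != 0)
        return -1;
    return write_attr(rport_dir, "dev_loss_tmo", dev_loss);
}
```

Stopping after a failed first write keeps the sketch from attempting a dev_loss_tmo value that the capping check would reject anyway.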