Doug / smarc-fsl-linux-kernel | Embedian Git Server

23 Sep, 2013

1 commit

f47908269 dm: add reserved_rq_based_ios module parameter

Allow user to change the number of IOs that are reserved by
request-based DM's mempools by writing to this file:
/sys/module/dm_mod/parameters/reserved_rq_based_ios

The default value is RESERVED_REQUEST_BASED_IOS (256). The maximum
allowed value is RESERVED_MAX_IOS (1024).

Export dm_get_reserved_rq_based_ios() for use by DM targets and core
code. Switch to sizing dm-mpath's mempool using DM core's configurable
'reserved_rq_based_ios'.
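
The limits described above can be sketched as a small clamping helper. This is only an illustration: the function name is hypothetical, and treating a value of 0 as "use the default" is an assumption of this sketch that the commit message does not state.

```c
/* Values quoted in the commit message. */
#define RESERVED_REQUEST_BASED_IOS 256u
#define RESERVED_MAX_IOS 1024u

/*
 * Hypothetical helper: map the raw module-parameter value to the number of
 * reserved IOs that would actually be used.
 * Assumption: 0 means "not set", so the default applies.
 */
unsigned clamp_reserved_ios(unsigned requested)
{
    if (requested == 0)
        return RESERVED_REQUEST_BASED_IOS;
    if (requested > RESERVED_MAX_IOS)
        return RESERVED_MAX_IOS;   /* never exceed the allowed maximum */
    return requested;
}
```

With this sketch, a request for 4096 reserved IOs would be capped at 1024, while 512 would be used unchanged.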

Signed-off-by: Mike Snitzer
Signed-off-by: Frank Mayhar
Acked-by: Mikulas Patocka

Mike Snitzer
2013-09-23 22:42:24 +0800

20 Sep, 2013

1 commit

f84cb8a46 dm mpath: disable WRITE SAME if it fails

Work around the SCSI layer's problematic WRITE SAME heuristics by
disabling WRITE SAME in the DM multipath device's queue_limits if an
underlying device disabled it.

The WRITE SAME heuristics, with both the original commit 5db44863b6eb
("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
WRITE SAME(10) even without successfully determining it is supported.
After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
for the device (by setting sdkp->device->no_write_same, which results in
'max_write_same_sectors' in the device's queue_limits being set to 0).

When a device is stacked on top of such a SCSI device any changes to that
SCSI device's queue_limits do not automatically propagate up the stack.
As such, a DM multipath device will not have its WRITE SAME support
disabled. This causes the block layer to continue to issue WRITE SAME
requests to the mpath device which causes paths to fail and (if mpath IO
isn't configured to queue when no paths are available) it will result in
actual IO errors to the upper layers.

This fix doesn't help configurations that have additional devices
stacked on top of the mpath device (e.g. LVM-created linear DM devices
on top). A proper fix that restacks all the queue_limits from the bottom
of the device stack up will need to be explored if SCSI will continue to
use this model of optimistically allowing op codes and then disabling
them after they fail for the first time.
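
As a rough model of the workaround, the sketch below mirrors an underlying device's disabled WRITE SAME limit into the multipath device's limits when a WRITE SAME request fails. The struct and function names are hypothetical stand-ins, not the kernel's actual types or call sites.

```c
#include <stdbool.h>

/* Simplified stand-in for a block device's queue limits. */
struct queue_limits_model {
    unsigned int max_write_same_sectors; /* 0 means WRITE SAME is disabled */
};

/*
 * Called when a request completes with an error on a path.
 * If the request was a WRITE SAME and the path's device has since disabled
 * WRITE SAME, disable it on the multipath device too, so the block layer
 * stops issuing such requests to it.
 * Returns true if the multipath limits were changed.
 */
bool mpath_sync_write_same(struct queue_limits_model *mpath,
                           const struct queue_limits_model *underlying,
                           bool is_write_same, int error)
{
    if (!is_write_same || error == 0)
        return false;
    if (underlying->max_write_same_sectors != 0)
        return false; /* the path's device still advertises support */
    if (mpath->max_write_same_sectors == 0)
        return false; /* already disabled, nothing to do */
    mpath->max_write_same_sectors = 0;
    return true;
}
```

Devices stacked above the multipath device are not touched by this model either, which matches the limitation noted above.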

Before this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
device-mapper: multipath: Failing path 8:112.
end_request: I/O error, dev dm-6, sector 4616
dm-6: WRITE SAME failed. Manually zeroing.
end_request: I/O error, dev dm-6, sector 4616
end_request: I/O error, dev dm-6, sector 5640
end_request: I/O error, dev dm-6, sector 6664
end_request: I/O error, dev dm-6, sector 7688
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
end_request: I/O error, dev dm-6, sector 524296
Aborting journal on device dm-6-8.
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
33553920

After this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
0

It should be noted that WRITE SAME support wasn't enabled in DM
multipath until v3.10.

Signed-off-by: Mike Snitzer
Cc: Martin K. Petersen
Cc: Hannes Reinecke
Cc: stable@vger.kernel.org # 3.10+

Mike Snitzer
2013-09-20 22:36:34 +0800

19 Sep, 2013

1 commit

cc9d3c382 dm mpath: do not fail path on -ENOSPC

Since ENOSPC is a target-side error, dm-mpath should just pass the error
information to the upper layer instead of retrying itself with path failover.
Otherwise it will end up failing all paths while the path checkers find
all paths OK.

ENOSPC can now be returned from a SCSI device after commit a9d6ceb8
("[SCSI] return ENOSPC on thin provisioning failure").
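
The distinction between target-side and path-side errors might be modelled as below. This is a sketch, not the driver's code: the enum and function names are hypothetical, and only ENOSPC and EILSEQ are listed because those are the codes this log explicitly says should be passed up the stack.

```c
#include <errno.h>
#include <stdbool.h>

/* What the multipath layer does when a request completes. */
enum mpath_action {
    MPATH_COMPLETE,            /* no error: finish the request */
    MPATH_PASS_UP,             /* target-side error: report it, keep the path */
    MPATH_FAIL_PATH_AND_RETRY  /* anything else: fail the path, try another */
};

/* True when switching paths cannot help because the target itself failed. */
bool mpath_error_is_target_side(int error)
{
    switch (error) {
    case -ENOSPC: /* e.g. thin provisioning ran out of space */
    case -EILSEQ: /* integrity error, owned by the integrity metadata user */
        return true;
    default:
        return false;
    }
}

enum mpath_action mpath_end_io_action(int error)
{
    if (error == 0)
        return MPATH_COMPLETE;
    if (mpath_error_is_target_side(error))
        return MPATH_PASS_UP;
    return MPATH_FAIL_PATH_AND_RETRY;
}
```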

Signed-off-by: Jun'ichi Nomura
Acked-by: Hannes Reinecke
Signed-off-by: Mike Snitzer

Jun'ichi Nomura
2013-09-19 02:41:06 +0800

24 Aug, 2013

1 commit

7e782af57 [SCSI] Return ENODATA on medium error

When a medium error is detected the SCSI stack should return
ENODATA to the upper layers.

[jejb: fix whitespace error]
Signed-off-by: Hannes Reinecke
Signed-off-by: James Bottomley

Hannes Reinecke
2013-08-24 00:54:53 +0800

11 Jul, 2013

1 commit

6c182cd88 dm mpath: fix ioctl deadlock when no paths

When multipath needs to retry an ioctl the reference to the
current live table needs to be dropped. Otherwise a deadlock
occurs when all paths are down:
- dm_blk_ioctl takes a reference to the current table
  and spins in multipath_ioctl().
- A new table is being loaded, but upon resume the process
  hangs in dm_table_destroy() waiting for references to
  drop to zero.

With this patch the reference to the old table is dropped
prior to retry, thereby avoiding the deadlock.

Signed-off-by: Hannes Reinecke
Cc: Mike Snitzer
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon

Hannes Reinecke
2013-07-11 06:41:15 +0800

10 May, 2013

1 commit

042bcef88 dm mpath: enable WRITE SAME support

Enable WRITE SAME support in dm multipath. As far as multipath is
concerned it is just another write request.

Signed-off-by: Mike Snitzer
Tested-by: Bharata B Rao
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2013-05-10 21:37:16 +0800

02 Mar, 2013

2 commits

55a62eef8 dm: rename request variables to bios

Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.

No functional changes.

Signed-off-by: Alasdair G Kergon

Alasdair G Kergon
2013-03-02 06:45:47 +0800

fd7c092e7 dm: fix truncated status strings

Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.

When processing a table or status request, the function retrieve_status
calls ti->type->status. If ti->type->status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.

However, targets don't return non-zero values from their status method
on overflow. Most targets always return zero.

If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.

In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
  key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
  This is incorrect because retrieve_status doesn't check the error
  code; it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.

This patch changes the ti->type->status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.
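
The caller-side overflow check described above can be sketched as follows. The function names are hypothetical, and the status method here is a stand-in that simply copies a string with truncation.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical status method: writes a status string into `out`,
 * truncating silently if it does not fit, and returns nothing
 * (as the status method does after this patch). `maxlen` must be > 0.
 */
void example_status(char *out, size_t maxlen, const char *text)
{
    snprintf(out, maxlen, "%s", text);
}

/*
 * Caller-side check: if the string plus its terminating NUL occupies all
 * of the remaining space, assume the output overflowed.
 */
bool status_filled_buffer(const char *out, size_t remaining)
{
    return strlen(out) + 1 >= remaining;
}
```

Note that a string which fits exactly is also reported as an overflow in this sketch. That is the conservative side of "it is assumed": the caller retries with a larger buffer rather than risk returning truncated data.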

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka
Signed-off-by: Alasdair G Kergon

Mikulas Patocka
2013-03-02 06:45:44 +0800

12 Oct, 2012

1 commit

a71a261f5 dm mpath: fix check for null mpio in end_io fn

The mpio dereference should be moved below the BUG_ON NULL test
in multipath_end_io().

spatch with a semantic match was used to find this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun
Signed-off-by: Alasdair G Kergon

Wei Yongjun
2012-10-12 23:59:42 +0800

03 Oct, 2012

1 commit

033d9959e Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue changes from Tejun Heo:
"This is workqueue updates for v3.7-rc1. A lot of activities this
round including considerable API and behavior cleanups.

* delayed_work combines a timer and a work item. The handling of the
timer part has always been a bit clunky leading to confusing
cancelation API with weird corner-case behaviors. delayed_work is
updated to use new IRQ safe timer and cancelation now works as
expected.

* Another deficiency of delayed_work was lack of the counterpart of
mod_timer() which led to cancel+queue combinations or open-coded
timer+work usages. mod_delayed_work[_on]() are added.

These two delayed_work changes make delayed_work provide interface
and behave like timer which is executed with process context.

* A work item could be executed concurrently on multiple CPUs, which
is rather unintuitive and made flush_work() behavior confusing and
half-broken under certain circumstances. This problem doesn't
exist for non-reentrant workqueues. While non-reentrancy check
isn't free, the overhead is incurred only when a work item bounces
across different CPUs and even in simulated pathological scenario
the overhead isn't too high.

All workqueues are made non-reentrant. This removes the
distinction between flush_[delayed_]work() and
flush_[delayed_]work_sync(). The former is now as strong as the
latter and the specified work item is guaranteed to have finished
execution of any previous queueing on return.

* In addition to the various bug fixes, Lai redid and simplified CPU
hotplug handling significantly.

* Joonsoo introduced system_highpri_wq and used it during CPU
hotplug.

There are two merge commits - one to pull in IRQ safe timer from
tip/timers/core and the other to pull in CPU hotplug fixes from
wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."

Fixed a number of trivial conflicts, but the more interesting conflicts
were silent ones where the deprecated interfaces had been used by new
code in the merge window, and thus didn't cause any real data conflicts.

Tejun pointed out a few of them, I fixed a couple more.

* 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
workqueue: remove @delayed from cwq_dec_nr_in_flight()
workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
workqueue: use __cpuinit instead of __devinit for cpu callbacks
workqueue: rename manager_mutex to assoc_mutex
workqueue: WORKER_REBIND is no longer necessary for idle rebinding
workqueue: WORKER_REBIND is no longer necessary for busy rebinding
workqueue: reimplement idle worker rebinding
workqueue: deprecate __cancel_delayed_work()
workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
workqueue: use mod_delayed_work() instead of __cancel + queue
workqueue: use irqsafe timer for delayed_work
workqueue: clean up delayed_work initializers and add missing one
workqueue: make deferrable delayed_work initializer names consistent
workqueue: cosmetic whitespace updates for macro definitions
workqueue: deprecate system_nrt[_freezable]_wq
workqueue: deprecate flush[_delayed]_work_sync()
...

Linus Torvalds
2012-10-03 00:54:49 +0800

27 Sep, 2012

1 commit

7ba10aa6f dm mpath: only retry ioctl when no paths if queue_if_no_path set

When there are no paths and multipath receives an ioctl, it waits until
a path becomes available. This behaviour is incorrect if the
"queue_if_no_path" setting was not specified, as then the ioctl should
be rejected immediately, which this patch now does.

commit 35991652b ("dm mpath: allow ioctls to trigger pg init") should
have checked if queue_if_no_path was configured before queueing IO.
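
A minimal sketch of the decision this fix introduces is shown below. The function name is hypothetical, -EAGAIN is the retry code mentioned elsewhere in this log for ioctls, and the use of -EIO for the immediate rejection is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide what to do with an ioctl before forwarding it.
 * Returns 0 to forward it, -EAGAIN to ask the caller to wait and retry,
 * or -EIO (assumed error code) to reject it immediately.
 */
int mpath_ioctl_precheck(bool have_usable_path, bool queue_if_no_path)
{
    if (have_usable_path)
        return 0;        /* forward the ioctl to the path's device */
    if (queue_if_no_path)
        return -EAGAIN;  /* configured to wait until a path returns */
    return -EIO;         /* no paths and no queueing: fail right away */
}
```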

Checking for the queue_if_no_path feature, as is done in map_io(),
allows the following table load to work without blocking in the
multipath_ioctl retry loop:

echo "0 1024 multipath 0 0 0 0" | dmsetup create mpath_nodevs

Without this fix the multipath_ioctl will block with the following stack
trace:

blkid D 0000000000000002 0 23936 1 0x00000000
ffff8802b89e5cd8 0000000000000082 ffff8802b89e5fd8 0000000000012440
ffff8802b89e4010 0000000000012440 0000000000012440 0000000000012440
ffff8802b89e5fd8 0000000000012440 ffff88030c2aab30 ffff880325794040
Call Trace:
schedule+0x29/0x70
schedule_timeout+0x182/0x2e0
? lock_timer_base+0x70/0x70
schedule_timeout_uninterruptible+0x1e/0x20
msleep+0x20/0x30
multipath_ioctl+0x109/0x170 [dm_multipath]
dm_blk_ioctl+0xbc/0xd0 [dm_mod]
__blkdev_driver_ioctl+0x28/0x30
blkdev_ioctl+0xce/0x730
block_ioctl+0x3c/0x40
do_vfs_ioctl+0x8c/0x340
? sys_newfstat+0x33/0x40
sys_ioctl+0xa1/0xb0
system_call_fastpath+0x16/0x1b

Signed-off-by: Mike Snitzer
Cc: stable@vger.kernel.org # 3.5+
Acked-by: Mikulas Patocka
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-09-27 06:45:41 +0800

21 Aug, 2012

1 commit

43829731d workqueue: deprecate flush[_delayed]_work_sync()

flush[_delayed]_work_sync() are now spurious. Mark them deprecated
and convert all users to flush[_delayed]_work().

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant and the regular flushes guarantee that the work item is
not pending or running on any CPU on return, so there's no reason to
use the sync flushes at all and they're going away.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo
Cc: Russell King
Cc: Paul Mundt
Cc: Ian Campbell
Cc: Jens Axboe
Cc: Mattia Dongili
Cc: Kent Yoder
Cc: David Airlie
Cc: Jiri Kosina
Cc: Karsten Keil
Cc: Bryan Wu
Cc: Benjamin Herrenschmidt
Cc: Alasdair Kergon
Cc: Mauro Carvalho Chehab
Cc: Florian Tobias Schandinat
Cc: David Woodhouse
Cc: "David S. Miller"
Cc: linux-wireless@vger.kernel.org
Cc: Anton Vorontsov
Cc: Sangbeom Kim
Cc: "James E.J. Bottomley"
Cc: Greg Kroah-Hartman
Cc: Eric Van Hensbergen
Cc: Takashi Iwai
Cc: Steven Whitehouse
Cc: Petr Vandrovec
Cc: Mark Fasheh
Cc: Christoph Hellwig
Cc: Avi Kivity

Tejun Heo
2012-08-21 05:51:24 +0800

27 Jul, 2012

2 commits

1f4e0ff07 dm thin: commit before gathering status

Commit outstanding metadata before returning the status for a dm thin
pool so that the numbers reported are as up-to-date as possible.

The commit is not performed if the device is suspended or if
the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target
through a new 'status_flags' parameter in the target's dm_status_fn.

The userspace dmsetup tool will support the --noflush flag with the
'dmsetup status' and 'dmsetup wait' commands from version 1.02.76
onwards.

Tested-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Alasdair G Kergon
2012-07-27 22:08:16 +0800

a58a935d5 dm mpath: add retain_attached_hw_handler feature

A SCSI device handler might get attached to a device during the
initial device scan. We do not necessarily want to override
this when loading a multipath table, so this patch adds a new
multipath feature argument "retain_attached_hw_handler".

During SCSI device scan all loaded SCSI device handlers will be
consulted for a match (via scsi_dh's provided .match). If a match is
found that device handler will be attached. We need a way to have
userspace multipathd's provided 'hw_handler' not override the already
attached hardware handler.

When specifying the new feature 'retain_attached_hw_handler' multipath
will use the currently attached hardware handler instead of trying to
attach the one specified during table load. If no hardware handler is
attached the specified hardware handler will still be used.
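
The selection rule in the paragraph above might be expressed like this. The function name is hypothetical and handler names are passed as plain strings purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Pick the hardware handler to use for a path.
 * `attached` is the handler already attached by the SCSI scan (NULL if
 * none); `specified` is the one named in the multipath table.
 */
const char *choose_hw_handler(bool retain_attached_hw_handler,
                              const char *attached, const char *specified)
{
    if (retain_attached_hw_handler && attached != NULL)
        return attached;   /* keep what the device scan attached */
    return specified;      /* otherwise use the table's handler */
}
```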

Leverages scsi_dh_attach's ability to increment the scsi_dh's reference
count if the same scsi_dh name is provided when attaching - currently
attached scsi_dh name is determined with scsi_dh_attached_handler_name.

Depends upon commit 7e8a74b177f17d100916b6ad415450f7c9508691
("[SCSI] scsi_dh: add scsi_dh_attached_handler_name").

Signed-off-by: Mike Snitzer
Tested-by: Babu Moger
Reviewed-by: Chandra Seetharaman
Acked-by: Hannes Reinecke
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-07-27 22:08:04 +0800

03 Jun, 2012

3 commits

35991652b dm mpath: allow ioctls to trigger pg init

After the failure of a group of paths, any alternative paths that
need initialising do not become available until further I/O is sent to
the device. Until this has happened, ioctls return -EAGAIN.

With this patch, new paths are made available in response to an ioctl
too. The processing of the ioctl gets delayed until this has happened.

Instead of returning an error, we submit a work item to kmultipathd
(that will potentially activate the new path) and retry in ten
milliseconds.

Note that the patch doesn't retry an ioctl if the ioctl itself fails due
to a path failure. Such retries should be handled intelligently by the
code that generated the ioctl in the first place, noting that some SCSI
commands should not be retried because they are not idempotent (XOR write
commands). For commands that could be retried, there is a danger that
if the device rejected the SCSI command, the path could be erroneously
marked as failed, and the request would be retried on another path which
might fail too. It can be determined if the failure happens on the
device or on the SCSI controller, but there is no guarantee that all
SCSI drivers set these flags correctly.

Signed-off-by: Mikulas Patocka
Signed-off-by: Alasdair G Kergon

Mikulas Patocka
2012-06-03 07:29:58 +0800

f220fd4ef dm mpath: delay retry of bypassed pg

If I/O needs retrying and only bypassed priority groups are available,
set the pg_init_delay_retry flag to wait before retrying.

If, for example, the reason for the bypass is that the controller is
getting reset or there is a firmware upgrade happening, retrying right
away would cause a flood of log messages and retries for what could be a
few seconds or even several minutes.

Signed-off-by: Mike Christie
Acked-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mike Christie
2012-06-03 07:29:45 +0800

1fbdd2b3a dm mpath: reduce size of struct multipath

Move multipath structure's 'lock' and 'queue_size' members to eliminate
two 4-byte holes. Also use a bit within a single unsigned int for each
existing flag (saves 8 bytes). This allows future flags to be added
without each consuming an unsigned int.
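
The space saving from bit flags can be illustrated with two hypothetical structs. The flag count of three is only inferred from the stated 8-byte saving, assuming a 4-byte unsigned int, and `third_flag` is a placeholder name, not a field taken from the driver.

```c
#include <stddef.h>

/* Before: each flag occupies a whole unsigned int. */
struct flags_as_words {
    unsigned queue_io;
    unsigned queue_if_no_path;
    unsigned third_flag;        /* placeholder name */
};

/* After: each flag is a single bit inside one shared unsigned int. */
struct flags_as_bits {
    unsigned queue_io:1;
    unsigned queue_if_no_path:1;
    unsigned third_flag:1;      /* placeholder name */
};

/* Number of bytes saved by switching to bit flags on this platform. */
size_t flag_bytes_saved(void)
{
    return sizeof(struct flags_as_words) - sizeof(struct flags_as_bits);
}
```

The exact sizes depend on the ABI, so the sketch computes the saving rather than hard-coding it. Further one-bit flags can be added to the second struct without growing it until the shared word is full.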

Signed-off-by: Mike Snitzer
Acked-by: Hannes Reinecke
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-06-03 07:29:43 +0800

12 May, 2012

1 commit

510193a2d dm mpath: check if scsi_dh module already loaded before trying to load

If the requested scsi_dh module is already loaded then skip
request_module().

Multipath table loads can hang in an unnecessary __request_module.

Reported-by: Ben Marzinski
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-05-12 08:43:21 +0800

29 Mar, 2012

2 commits

31998ef19 dm: reject trailing characters in sccanf input

Device mapper uses sscanf to convert arguments to numbers. The problem is that
the way we use it ignores additional unmatched characters in the scanned string.

For example, this `if (sscanf(string, "%d", &number) == 1)' will match a number,
but it will also match a number with some garbage appended, like "123abc".

As a result, device mapper accepts garbage after some numbers. For example
the command `dmsetup create vg1-new --table "0 16384 linear 254:1bla 34816bla"'
will pass without an error.

This patch fixes all sscanf uses in device mapper. It appends "%c" with
a pointer to a dummy character variable to every sscanf statement.

The construct `if (sscanf(string, "%d%c", &number, &dummy) == 1)' succeeds
only if string is a null-terminated number (optionally preceded by some
whitespace characters). If there is some character appended after the number,
sscanf matches "%c", writes the character to the dummy variable and returns 2.
We check the return value for 1 and consequently reject numbers with some
garbage appended.
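
The construct can be wrapped in a small userspace helper to see the behaviour. The helper name is hypothetical; the format string and the comparison with 1 are the ones given in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse a decimal integer and reject any trailing characters.
 * sscanf returns 1 only when "%d" matched and "%c" found nothing left to
 * read; a trailing character makes it return 2, which is rejected.
 */
bool parse_int_strict(const char *s, int *number)
{
    char dummy;

    return sscanf(s, "%d%c", number, &dummy) == 1;
}
```

With this helper, "123" and " 42" are accepted, while "123abc" and "34816bla" are rejected.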

Signed-off-by: Mikulas Patocka
Acked-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mikulas Patocka
2012-03-29 01:41:26 +0800

466891f99 dm mpath: detect invalid map_context

The map_context pointer should always be set. However, we have reports
that upon requeuing it is not set correctly. So add set and clear
functions with a BUG_ON() to track the issue properly.

Signed-off-by: Jun'ichi Nomura
Cc: Mike Snitzer
Acked-by: Hannes Reinecke
Tested-by: Heiko Carstens
Acked-by: Dave Wysochanski
Signed-off-by: Alasdair G Kergon

Jun'ichi Nomura
2012-03-29 01:41:25 +0800

15 Jan, 2012

1 commit

ec8013bed dm: do not forward ioctls from logical volumes to the underlying device

A logical volume can map to just part of the underlying physical volume.
In this case, it must be treated like a partition.

Based on a patch from Alasdair G Kergon.

Cc: Alasdair G Kergon
Cc: dm-devel@redhat.com
Signed-off-by: Paolo Bonzini
Signed-off-by: Linus Torvalds

Paolo Bonzini
2012-01-15 07:07:24 +0800

02 Aug, 2011

2 commits

498f0103e dm table: share target argument parsing functions

Move multipath target argument parsing code into dm-table so other
targets can share it.

Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2011-08-02 19:32:04 +0800

286f367da dm mpath: fix potential NULL pointer in feature arg processing

Avoid dereferencing a NULL pointer if the number of feature arguments
supplied is fewer than indicated.

Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon
Cc: stable@kernel.org

Mike Snitzer
2011-08-02 19:32:00 +0800

27 Jul, 2011

1 commit

60063497a atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arun Sharma
2011-07-27 07:49:47 +0800

29 May, 2011

1 commit

6f13f6fba dm mpath: do not fail paths after integrity errors

Integrity errors need to be passed to the owner of the integrity
metadata for processing. Consequently EILSEQ should be passed up the
stack.

Cc: stable@kernel.org
Signed-off-by: Martin K. Petersen
Acked-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Martin K. Petersen
2011-05-29 20:02:55 +0800

24 Mar, 2011

2 commits

a490a07a6 dm mpath: allow table load with no priority groups

This patch adjusts the multipath target to allow a table with both 0
priority groups and 0 for the initial priority group number.

If any mpath device is held open when all paths in the last priority
group have failed, userspace multipathd will attempt to reload the
associated DM table to reflect the fact that the device no longer has
any priority groups. But the reload attempt always failed because the
multipath target did not allow 0 priority groups.

All multipath target messages related to priority groups (enable_group,
disable_group, switch_group) will handle a priority group of 0 by
returning an error.

When reloading a multipath table with 0 priority groups, userspace
multipathd must be updated to specify an initial priority group number
of 0 (rather than 1).

Signed-off-by: Mike Snitzer
Cc: Babu Moger
Acked-by: Hannes Reinecke
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2011-03-24 21:54:33 +0800

19040c0bc dm mpath: fail message ioctl if specified path is not valid

Fail the reinstate_path and fail_path message ioctl if the specified
path is not valid.

The message ioctl would succeed for the 'reinstate_path' and
'fail_path' messages even if no action was taken because the
specified device was not a valid path of the multipath device.

Before, when /dev/vdb is not a path of mpathb:
$ dmsetup message mpathb 0 reinstate_path /dev/vdb
$ echo $?
0

After:
$ dmsetup message mpathb 0 reinstate_path /dev/vdb
device-mapper: message ioctl failed: Invalid argument
Command failed
$ echo $?
1
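
The changed behaviour can be sketched as a lookup that reports an error when nothing matched. The function name and the string-based path list are hypothetical; -EINVAL corresponds to the "Invalid argument" text in the output above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Handle a path message (such as reinstate_path) for device `dev`.
 * Returns 0 if `dev` is one of the multipath device's paths, or -EINVAL
 * if it is not, instead of silently reporting success.
 */
int path_message(const char *const *paths, size_t npaths, const char *dev)
{
    for (size_t i = 0; i < npaths; i++) {
        if (strcmp(paths[i], dev) == 0)
            return 0;   /* path found: the action would be applied here */
    }
    return -EINVAL;     /* not a path of this multipath device */
}
```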

Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2011-03-24 21:54:31 +0800

13 Feb, 2011

1 commit

751b2a7d6 [SCSI] dm mpath: propagate target errors immediately

DM now has more information about the nature of the underlying storage
failure. Path failure is avoided if a request failed due to a target
error. Instead the target error is immediately passed up the stack.

Discard requests that fail due to non-target errors may now be retried.

Errors restricted to the path will be retried or returned if no
paths are available, regardless of the no_path_retry setting.

Signed-off-by: Mike Snitzer
Signed-off-by: Hannes Reinecke
Acked-by: Alasdair G Kergon
Signed-off-by: James Bottomley

Hannes Reinecke
2011-02-13 00:33:29 +0800

14 Jan, 2011

4 commits

4e2d19e46 dm mpath: delay activate_path retry on SCSI_DH_RETRY

This patch adds a user-configurable 'pg_init_delay_msecs' feature. Use
this feature to specify the number of milliseconds to delay before
retrying scsi_dh_activate, when SCSI_DH_RETRY is returned.

SCSI Device Handlers return SCSI_DH_IMM_RETRY if we could retry
activation immediately and SCSI_DH_RETRY in cases where it is better to
retry after some delay.

Currently we immediately retry scsi_dh_activate irrespective of
SCSI_DH_IMM_RETRY and SCSI_DH_RETRY.

The 'pg_init_delay_msecs' feature may be provided during table create or
load, e.g.:
dmsetup create --table "0 20971520 multipath 3 queue_if_no_path \
pg_init_delay_msecs 2500 ..." mpatha

The default for 'pg_init_delay_msecs' is 2000 milliseconds.
Maximum configurable delay is 60000 milliseconds. Specifying a
'pg_init_delay_msecs' of 0 will cause immediate retry.
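
The limits and retry behaviour described above might be modelled as follows. All names are hypothetical stand-ins for the driver's own, and rejecting an out-of-range value with -EINVAL is an assumption of this sketch.

```c
#include <errno.h>

#define PG_INIT_DELAY_DEFAULT_MSECS 2000u  /* default from the commit message */
#define PG_INIT_DELAY_MAX_MSECS 60000u     /* maximum from the commit message */

/* Simplified model of what a device handler reports after activation. */
enum dh_result_model { DH_OK, DH_IMM_RETRY, DH_RETRY };

/* Validate a user-supplied delay: 0 on success, -EINVAL if too large. */
int pg_init_delay_validate(unsigned msecs)
{
    return msecs > PG_INIT_DELAY_MAX_MSECS ? -EINVAL : 0;
}

/*
 * Milliseconds to wait before retrying activation.
 * Only DH_RETRY uses the configured delay; DH_IMM_RETRY retries at once.
 */
unsigned pg_init_retry_delay(enum dh_result_model result,
                             unsigned configured_msecs)
{
    if (result == DH_RETRY)
        return configured_msecs;
    return 0;
}
```

A caller that was given no explicit value would pass PG_INIT_DELAY_DEFAULT_MSECS as the configured delay.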

Signed-off-by: Nikanth Karthikesan
Signed-off-by: Chandra Seetharaman
Acked-by: Mike Christie
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Chandra Seetharaman
2011-01-14 04:00:01 +0800

4d4d66ab5 dm: convert workqueues to alloc_ordered

Convert all create[_singlethread]_workqueue() users to the new
alloc[_ordered]_workqueue(). This conversion is mechanical and
doesn't introduce any behavior change.

Signed-off-by: Tejun Heo
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Tejun Heo
2011-01-14 03:59:57 +0800

d5ffa387e dm: dont use flush_scheduled_work

flush_scheduled_work() is being deprecated. Flush the used work
directly instead. In all dm targets, the only work which uses
system_wq is ->trigger_event.

Signed-off-by: Tejun Heo
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Tejun Heo
2011-01-14 03:59:56 +0800

09c9d4c9b dm mpath: disable blk_abort_queue

Revert commit 224cb3e981f1b2f9f93dbd49eaef505d17d894c2
dm: Call blk_abort_queue on failed paths

Multipath began to use blk_abort_queue() to allow for
lower latency path deactivation. This was found to
cause list corruption:

the cmd gets blk_abort_queued/timedout run on it and the scsi eh
somehow is able to complete and run scsi_queue_insert while
scsi_request_fn is still trying to process the request.

https://www.redhat.com/archives/dm-devel/2010-November/msg00085.html

Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon
Cc: Mike Anderson
Cc: Mike Christie
Cc: stable@kernel.org

Mike Snitzer
2011-01-14 03:59:46 +0800

12 Aug, 2010

2 commits

959eb4e55 dm mpath: support discard

Enable discard support in the DM multipath target.

This discard support depends on a few discard-specific fixes to the
block layer's request stacking driver methods.

Discard requests are optional so don't allow a failed discard to trigger
path failures. If there is a real problem with a given path the
barriers associated with the discard (either before or after the
discard) will cause path failure. That said, unconditionally passing
discard failures up the stack is not ideal. This must be fixed once DM
has more information about the nature of the underlying storage failure.

Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon
Cc: Kiyoshi Ueda

Mike Snitzer
2010-08-12 11:14:32 +0800

6bbf79a14 dm mpath: fix NULL pointer dereference when path parameters missing

multipath_ctr() forgets to return an error after detecting
missing path parameters. Fix this.

Signed-off-by: Patrick LoPresti
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon

Alasdair G Kergon
2010-08-12 11:13:49 +0800

06 Mar, 2010

6 commits

8215d6ec5 dm table: remove unused dm_get_device range parameters

Remove unused parameters (start and len) of dm_get_device()
and fix the callers.

Signed-off-by: Nikanth Karthikesan
Signed-off-by: Alasdair G Kergon

Nikanth Karthikesan
2010-03-06 10:32:27 +0800

fb6126429 dm mpath: refactor pg_init

This patch pulls the pg_init path activation code out of
process_queued_ios() into a new function.

No functional change.

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Signed-off-by: Alasdair G Kergon

Kiyoshi Ueda
2010-03-06 10:32:18 +0800

2bded7bd7 dm mpath: wait for pg_init completion when suspending

When suspending the device we must wait for all I/O to complete, but
pg-init may be still in progress even after flushing the workqueue
for kmpath_handlerd in multipath_postsuspend.

This patch waits for pg-init completion correctly in
multipath_postsuspend().

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Signed-off-by: Alasdair G Kergon

Kiyoshi Ueda
2010-03-06 10:32:13 +0800

d0259bf0e dm mpath: hold io until all pg_inits completed

m->queue_io is set to block processing I/Os, and it needs to be kept
while pg-init, which issues multiple path activations, is in progress.
But m->queue_io is cleared when a path activation completes without error
in pg_init_done(), even while other path activations are in progress.
That may cause undesired -EIO on paths which have not completed activation.

This patch fixes that by not clearing m->queue_io until all path
activations complete.
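
The idea of holding I/O until the last activation finishes can be sketched with a simple counter. The struct, field, and function names are hypothetical, and only successful completions are modelled.

```c
#include <stdbool.h>

/* Simplified model of the multipath state relevant to pg-init. */
struct mpath_model {
    unsigned pg_init_in_progress; /* outstanding path activations */
    bool queue_io;                /* true: hold I/O instead of dispatching */
};

/* Start activating `npaths` paths and hold I/O while any are pending. */
void pg_init_start(struct mpath_model *m, unsigned npaths)
{
    m->pg_init_in_progress = npaths;
    m->queue_io = npaths > 0;
}

/* One path activation completed without error. */
void pg_init_done_model(struct mpath_model *m)
{
    if (m->pg_init_in_progress > 0)
        m->pg_init_in_progress--;
    /* Release held I/O only once every activation has finished. */
    if (m->pg_init_in_progress == 0)
        m->queue_io = false;
}
```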

(Before the hardware handlers were moved into the SCSI layer, pg_init
only used one path.)

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Signed-off-by: Alasdair G Kergon

Kiyoshi Ueda
2010-03-06 10:30:02 +0800

fce323dd6 dm mpath: avoid storing private suspended state

'suspended' flag in struct multipath was introduced to check whether
the multipath target is in suspended state, but the same check is
done through dm_suspended() now, so remove the flag and related code.

Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Cc: Mike Anderson
Signed-off-by: Alasdair G Kergon

Kiyoshi Ueda
2010-03-06 10:29:59 +0800

f7b934c81 dm mpath: skip activate_path for failed paths

This patch adds two minor fixes while processing device mapper path activation.

Skip failed paths while calling activate_path. If the path has already failed
then activate_path is certain to fail, so there is no need to call it. In
some cases calling it might cause prolonged retries unnecessarily.

Change the misleading message if the path being activated fails with SCSI_DH_NOSYS.

Signed-off-by: Babu Moger
Signed-off-by: Alasdair G Kergon

Moger, Babu
2010-03-06 10:29:49 +0800