Documentation/device-mapper/dm-raid.txt 9.54 KB
  dm-raid
  =======

  The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
  It allows the MD RAID drivers to be accessed using a device-mapper
  interface.

  
  Mapping Table Interface
  -----------------------
  The target is named "raid" and it accepts the following parameters:
  
    <raid_type> <#raid_params> <raid_params> \
      <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]
  
  <raid_type>:
    raid1		RAID1 mirroring
    raid4		RAID4 dedicated parity disk
    raid5_la	RAID5 left asymmetric
  		- rotating parity 0 with data continuation
    raid5_ra	RAID5 right asymmetric
  		- rotating parity N with data continuation
    raid5_ls	RAID5 left symmetric
  		- rotating parity 0 with data restart
    raid5_rs 	RAID5 right symmetric
  		- rotating parity N with data restart
    raid6_zr	RAID6 zero restart
  		- rotating parity zero (left-to-right) with data restart
    raid6_nr	RAID6 N restart
  		- rotating parity N (right-to-left) with data restart
    raid6_nc	RAID6 N continue
  		- rotating parity N (right-to-left) with data continuation
    raid10        Various RAID10 inspired algorithms chosen by additional params
  		- RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
  		- RAID1E: Integrated Adjacent Stripe Mirroring
  		- RAID1E: Integrated Offset Stripe Mirroring
  		- and other similar RAID10 variants

    Reference: Chapter 4 of
    http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
  
  <#raid_params>: The number of parameters that follow.
  
  <raid_params> consists of
      Mandatory parameters:
          <chunk_size>: Chunk size in sectors.  This parameter is often known as
  		      "stripe size".  It is the only mandatory parameter and
  		      is placed first.
  
      followed by optional parameters (in any order):
  	[sync|nosync]   Force or prevent RAID initialization.
  	[rebuild <idx>]	Rebuild drive number 'idx' (first drive is 0).
  
  	[daemon_sleep <ms>]
  		Interval between runs of the bitmap daemon that
  		clear bits.  A longer interval means less bitmap I/O but
  		resyncing after a failure is likely to take longer.
  
  	[min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
  	[max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
  	[write_mostly <idx>]		   Mark drive index 'idx' write-mostly.
  	[max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
  	[stripe_cache <sectors>]           Stripe cache size (RAID 4/5/6 only)
  	[region_size <sectors>]
  		The region_size multiplied by the number of regions is the
  		logical size of the array.  The bitmap records the device
  		synchronisation state for each region.

          [raid10_copies   <# copies>]
          [raid10_format   <near|far|offset>]
  		These two options are used to alter the default layout of
  		a RAID10 configuration.  The number of copies can be
  		specified, but the default is 2.  There are also three
  		variations to how the copies are laid down - the default
  		is "near".  Near copies are what most people think of with
  		respect to mirroring.  If these options are left unspecified,
  		or 'raid10_copies 2' and/or 'raid10_format near' are given,
  		then the layouts for 2, 3 and 4 devices	are:
  		2 drives         3 drives          4 drives
  		--------         ----------        --------------
  		A1  A1           A1  A1  A2        A1  A1  A2  A2
  		A2  A2           A2  A3  A3        A3  A3  A4  A4
  		A3  A3           A4  A4  A5        A5  A5  A6  A6
  		A4  A4           A5  A6  A6        A7  A7  A8  A8
  		..  ..           ..  ..  ..        ..  ..  ..  ..
  		The 2-device layout is equivalent to 2-way RAID1.  The 4-device
  		layout is what a traditional RAID10 would look like.  The
  		3-device layout is what might be called a 'RAID1E - Integrated
  		Adjacent Stripe Mirroring'.
  		If 'raid10_copies 2' and 'raid10_format far', then the layouts
  		for 2, 3 and 4 devices are:
  		2 drives             3 drives             4 drives
  		--------             --------------       --------------------
  		A1  A2               A1   A2   A3         A1   A2   A3   A4
  		A3  A4               A4   A5   A6         A5   A6   A7   A8
  		A5  A6               A7   A8   A9         A9   A10  A11  A12
  		..  ..               ..   ..   ..         ..   ..   ..   ..
  		A2  A1               A3   A1   A2         A2   A1   A4   A3
  		A4  A3               A6   A4   A5         A6   A5   A8   A7
  		A6  A5               A9   A7   A8         A10  A9   A12  A11
  		..  ..               ..   ..   ..         ..   ..   ..   ..
  
  		If 'raid10_copies 2' and 'raid10_format offset', then the
  		layouts for 2, 3 and 4 devices are:
  		2 drives       3 drives           4 drives
  		--------       ------------       -----------------
  		A1  A2         A1  A2  A3         A1  A2  A3  A4
  		A2  A1         A3  A1  A2         A2  A1  A4  A3
  		A3  A4         A4  A5  A6         A5  A6  A7  A8
  		A4  A3         A6  A4  A5         A6  A5  A8  A7
  		A5  A6         A7  A8  A9         A9  A10 A11 A12
  		A6  A5         A9  A7  A8         A10 A9  A12 A11
  		..  ..         ..  ..  ..         ..  ..  ..  ..
  		Here we see layouts closely akin to 'RAID1E - Integrated
  		Offset Stripe Mirroring'.
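
The layouts above can be explored with a short Python sketch.  It is an
illustration only and is not taken from the MD code: near_layout() and
copies_on_distinct_drives() are hypothetical helper names, and the "near"
placement rule is inferred from the tables in this document.  The "far" and
"offset" tables are transcribed from above rather than generated.

```python
from collections import defaultdict


def near_layout(drives, rows, copies=2):
    """Return `rows` rows of chunk labels for a 'near' layout.

    Assumption (inferred from the tables above): slots are filled row by
    row, and each chunk occupies `copies` consecutive slots.
    """
    return [
        ["A%d" % ((r * drives + d) // copies + 1) for d in range(drives)]
        for r in range(rows)
    ]


def copies_on_distinct_drives(layout, copies=2):
    """True if every chunk appears exactly `copies` times, on different drives."""
    drives_by_chunk = defaultdict(list)
    for row in layout:
        for drive, chunk in enumerate(row):
            drives_by_chunk[chunk].append(drive)
    return all(
        len(found) == copies and len(set(found)) == copies
        for found in drives_by_chunk.values()
    )


# 4-drive "far" table and 3-drive "offset" table, copied from this document.
FAR_4 = [
    ["A1", "A2", "A3", "A4"],
    ["A5", "A6", "A7", "A8"],
    ["A9", "A10", "A11", "A12"],
    ["A2", "A1", "A4", "A3"],
    ["A6", "A5", "A8", "A7"],
    ["A10", "A9", "A12", "A11"],
]
OFFSET_3 = [
    ["A1", "A2", "A3"],
    ["A3", "A1", "A2"],
    ["A4", "A5", "A6"],
    ["A6", "A4", "A5"],
]
```

The check only makes sense on an excerpt that holds every copy of each chunk
it mentions, so pick a row count for near_layout() where drives * rows is a
multiple of the number of copies.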
  <#raid_devs>: The number of devices composing the array.
  	Each device consists of two entries.  The first is the device
  	containing the metadata (if any); the second is the one containing the
  	data.
  
  	If a drive has failed or is missing at creation time, a '-' can be
  	given for both the metadata and data drives for a given position.
  Example Tables
  --------------
  # RAID4 - 4 data drives, 1 parity (no metadata devices)
  # No metadata devices specified to hold superblock/bitmap info
  # Chunk size of 1MiB
  # (Lines separated for easy reading)

  0 1960893648 raid \
          raid4 1 2048 \
          5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
  # RAID4 - 4 data drives, 1 parity (with metadata devices)
  # Chunk size of 1MiB, force RAID initialization,
  #       min recovery rate at 20 kiB/sec/disk

  0 1960893648 raid \
          raid4 4 2048 sync min_recovery_rate 20 \
          5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
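
Tables like the two above can also be assembled programmatically.  The
following Python sketch is one possible way to do it; raid_table() and
chunk_sectors() are hypothetical helpers, not part of any device-mapper
tooling.  It assumes 512-byte sectors (so a 1MiB chunk is 2048 sectors) and
writes '-' for any metadata or data device that is absent.

```python
SECTOR_SIZE = 512  # assumption: device-mapper tables count 512-byte sectors


def chunk_sectors(chunk_bytes):
    """Convert a chunk size in bytes to sectors (1MiB -> 2048)."""
    if chunk_bytes % SECTOR_SIZE:
        raise ValueError("chunk size must be a whole number of sectors")
    return chunk_bytes // SECTOR_SIZE


def raid_table(start, length, raid_type, raid_params, devices):
    """Build a single-line table for the "raid" target.

    raid_params: the chunk size first, then any optional parameters,
                 flattened (e.g. [2048, "sync", "min_recovery_rate", 20]).
    devices:     list of (metadata_dev, data_dev) pairs; None becomes '-'.
    """
    fields = [start, length, "raid", raid_type, len(raid_params)]
    fields += list(raid_params)
    fields.append(len(devices))
    for metadata_dev, data_dev in devices:
        fields += [metadata_dev or "-", data_dev or "-"]
    return " ".join(str(field) for field in fields)


# The first example above: RAID4, no metadata devices, 1MiB chunks.
data_devs = ["8:17", "8:33", "8:49", "8:65", "8:81"]
table = raid_table(0, 1960893648, "raid4",
                   [chunk_sectors(1024 * 1024)],
                   [(None, dev) for dev in data_devs])
```

Deriving <#raid_params> and <#raid_devs> from the lists keeps the counts
consistent with what follows them, which is easy to get wrong by hand.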

  
  Status Output
  -------------
  'dmsetup table' displays the table used to construct the mapping.
  The optional parameters are always printed in the order listed
  above with "sync" or "nosync" always output ahead of the other
  arguments, regardless of the order used when originally loading the table.
  Arguments that can be repeated are ordered by value.

  
  'dmsetup status' yields information on the state and health of the array.
  The output is as follows (normally a single line, but expanded here for
  clarity):
  1: <s> <l> raid \
  2:      <raid_type> <#devices> <health_chars> \
  3:      <sync_ratio> <sync_action> <mismatch_cnt>

  Line 1 is the standard output produced by device-mapper.
  Lines 2 & 3 are produced by the raid target and are best explained by example:
          0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0
  Here we can see the RAID type is raid4, there are 5 devices - all of
  which are 'A'live, and the array is 2/490221568 complete with its initial
  recovery.  Here is a fuller description of the individual fields:
  	<raid_type>     Same as the <raid_type> used to create the array.
  	<health_chars>  One char for each device, indicating: 'A' = alive and
  			in-sync, 'a' = alive but not in-sync, 'D' = dead/failed.
  	<sync_ratio>    The ratio indicating how much of the array has undergone
  			the process described by 'sync_action'.  If the
  			'sync_action' is "check" or "repair", then the process
  			of "resync" or "recover" can be considered complete.
  	<sync_action>   One of the following possible states:
  			idle    - No synchronization action is being performed.
  			frozen  - The current action has been halted.
  			resync  - Array is undergoing its initial synchronization
  				  or is resynchronizing after an unclean shutdown
  				  (possibly aided by a bitmap).
  			recover - A device in the array is being rebuilt or
  				  replaced.
  			check   - A user-initiated full check of the array is
  				  being performed.  All blocks are read and
  				  checked for consistency.  The number of
  				  discrepancies found is recorded in
  				  <mismatch_cnt>.  No changes are made to the
  				  array by this action.
  			repair  - The same as "check", but discrepancies are
  				  corrected.
  			reshape - The array is undergoing a reshape.
  	<mismatch_cnt>  The number of discrepancies found between mirror copies
  			in RAID1/10 or wrong parity values found in RAID4/5/6.
  			This value is valid only after a "check" of the array
  			is performed.  A healthy array has a 'mismatch_cnt' of 0.
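
The fields described above are straightforward to pick apart in a script.
The Python sketch below is a minimal example that assumes exactly the status
layout shown in this document (later target versions may append further
fields, which it ignores); parse_raid_status() is a hypothetical name.

```python
from fractions import Fraction


def parse_raid_status(line):
    """Split one line of 'dmsetup status' output for a "raid" target."""
    fields = line.split()
    if len(fields) < 9 or fields[2] != "raid":
        raise ValueError("not a raid target status line")
    raid_type, num_devices, health, ratio, action, mismatch = fields[3:9]
    done, total = (int(part) for part in ratio.split("/"))
    return {
        "start": int(fields[0]),
        "length": int(fields[1]),
        "raid_type": raid_type,
        "devices": int(num_devices),
        "health_chars": health,
        # Positions marked 'D' (dead/failed); 'A' and 'a' are both alive.
        "failed": [i for i, char in enumerate(health) if char == "D"],
        "sync_ratio": Fraction(done, total),
        "sync_action": action,
        "mismatch_cnt": int(mismatch),
    }


status = parse_raid_status("0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0")
```

Keeping <sync_ratio> as an exact fraction avoids rounding a nearly complete
sync up to 100%; convert it with float() only when displaying it.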
  
  Message Interface
  -----------------
  The dm-raid target will accept certain actions through the 'message' interface.
  ('man dmsetup' for more information on the message interface.)  These actions
  include:
  	"idle"   - Halt the current sync action.
  	"frozen" - Freeze the current sync action.
  	"resync" - Initiate/continue a resync.
  	"recover"- Initiate/continue a recover process.
  	"check"  - Initiate a check (i.e. a "scrub") of the array.
  	"repair" - Initiate a repair of the array.
  	"reshape"- Currently unsupported (-EINVAL).
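
As a sketch of how a script might send one of these messages, the Python
helper below builds the argument list for 'dmsetup message <device> <sector>
<message>' and rejects anything outside the list above.  The function name and
the device name "my_raid" are hypothetical, and sector 0 is used on the
assumption that the raid target does not interpret it.  The command is only
constructed here, not run.

```python
# Messages listed in this document; "reshape" is left out because it is
# documented as unsupported (-EINVAL).
SYNC_MESSAGES = ("idle", "frozen", "resync", "recover", "check", "repair")


def dmsetup_message_argv(device, action):
    """Return the argv for sending a sync action message to a raid device."""
    if action not in SYNC_MESSAGES:
        raise ValueError("unsupported dm-raid message: %r" % (action,))
    return ["dmsetup", "message", device, "0", action]


# Example: request a scrub of the (hypothetical) device "my_raid".
argv = dmsetup_message_argv("my_raid", "check")
```

The resulting list can be passed to subprocess.run() by a caller with
sufficient privileges.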
  
  Version History
  ---------------
  1.0.0	Initial version.  Support for RAID 4/5/6
  1.1.0	Added support for RAID 1
  1.2.0	Handle creation of arrays that contain failed devices.
  1.3.0	Added support for RAID 10
  1.3.1	Allow device replacement/rebuild for RAID 10
  1.3.2   Fix/improve redundancy checking for RAID10
  1.4.0	Non-functional change.  Removes arg from mapping function.
  1.4.1   RAID10 fix redundancy validation checks (commit 55ebbb5).
  1.4.2   Add RAID10 "far" and "offset" algorithm support.
  1.5.0   Add message interface to allow manipulation of the sync_action.
  	New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
  1.5.1   Add ability to restore transiently failed devices on resume.
  1.5.2   'mismatch_cnt' is zero unless [last_]sync_action is "check".