07 Jan, 2006
40 commits
-
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This allows the role that a device has in an array to be viewed and set.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Move the checks - that dev size is never less than array size - into
bind_rdev_to_array to make sure they always happen properly (there is one place
where currently they don't).
Also reject any superblock which claims an array size larger than the device
in question can hold.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If array is active, try to reshape, else just set the value.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Store this total in superblock (as appropriate), and make it available to
userspace via sysfs.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Allow it to be set to a particular version, or 'none'.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
... only before array is started of course.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When we do a user-requested check/repair, we lose count of the outstanding
requests...
Also make sure that when anything is written to md/sync_action, the
RECOVERY_NEEDED flag is set and the thread is woken up so any changes take
effect.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When we update a page_cache page in the kernel, we need to call
flush_dcache_page or userspace might not see the change.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make the needlessly global function md_new_event() static.
Signed-off-by: Adrian Bunk
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
.. because they aren't used outside md.c
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Commands written to sysfs files may or may not be \n terminated. We want to
accept either case. For this we use cmd_match.
Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
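The matching rule described above can be sketched in a few lines. This is only an illustration in Python of the idea (accept the command with or without one trailing newline), not the kernel's C code; the parameter names are made up for the example.

```python
def cmd_match(written: str, command: str) -> bool:
    """Return True if `written` is `command`, optionally followed by one newline.

    `written` stands for the text a user wrote to a sysfs file and
    `command` for the keyword being tested; both names are illustrative.
    """
    return written == command or written == command + "\n"
```

Both `echo check > sync_action` (newline terminated) and a write of the bare string "check" would be accepted by such a test, while "checkx" or "chec" would not.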
md sometimes calls put_page on NULL pointers (treating it like kfree). This is
not safe, so define and use a 'safe_put_page' which checks for NULL.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The kernel should not be imposing these policy limits: The time between
bitmap updates should certainly be allowed to be more than 15 seconds, and
if someone wants a bitmap chunk size in excess of 4MB, the kernel isn't the
place to stop them.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The code to overwrite/reread for addressing read errors in raid1/raid10
currently assumes that the read will not alter the buffer which could be used
to write to the next device. This is not a safe assumption to make.
So we split the loops into an overwrite loop and a separate re-read loop, so
that the writing is complete before reading is attempted.
Cc: Paul Clements
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
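The loop split described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the raid1/raid10 code: the Device class, its failure flags, and the function name are hypothetical stand-ins for member devices and the kernel's bio handling.

```python
from typing import Dict, List, Optional


class Device:
    """Hypothetical in-memory stand-in for one member device."""

    def __init__(self, name: str, write_fails: bool = False,
                 read_fails: bool = False):
        self.name = name
        self.write_fails = write_fails
        self.read_fails = read_fails
        self.blocks: Dict[int, bytes] = {}

    def write(self, sector: int, data: bytes) -> bool:
        if self.write_fails:
            return False
        self.blocks[sector] = bytes(data)
        return True

    def read(self, sector: int) -> Optional[bytes]:
        if self.read_fails:
            return None
        return self.blocks.get(sector)


def overwrite_then_reread(devices: List[Device], sector: int,
                          good: bytes) -> List[str]:
    """Return the names of devices that failed the write or the re-read."""
    failed: List[str] = []
    # Loop 1: write the good data to every device. No reads are issued
    # here, so nothing can alter the buffer that is being written.
    for dev in devices:
        if not dev.write(sector, good):
            failed.append(dev.name)
    # Loop 2: only after all writes are complete, read each device back.
    for dev in devices:
        if dev.name in failed:
            continue
        if dev.read(sector) is None:
            failed.append(dev.name)
    return failed
```

In this sketch a device is reported as failed only if its write or its later re-read fails, and the second loop never starts until the first has finished.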
md supports multiple different RAID levels, each being implemented by a
'personality' (which is often in a separate module).
These personalities have fairly artificial 'numbers'. The numbers
are used to:
1- provide an index into an array where the various personalities
are recorded
2- identify the module (via an alias) which implements a particular
personality.
Neither of these uses really justifies the existence of personality numbers.
The array can be replaced by a linked list which is searched (array lookup
only happens very rarely). Module identification can be done using an alias
based on level rather than 'personality' number.
The current 'raid5' module supports two levels (4 and 5) but only one
personality. This slight awkwardness (which was handled in the mapping from
level to personality) can be better handled by allowing raid5 to register 2
personalities.
With this change in place, the core md module does not need to have an
exhaustive list of all possible personalities, so other personalities can be
added independently.
This patch also moves the check for chunksize being non-zero into the ->run
routines for the personalities that need it, rather than having it in core-md.
This has a side effect of allowing 'faulty' and 'linear' not to have a
chunk-size set.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
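The list-based registration described above might look roughly like the following. It is a Python sketch of the idea only; the class, the function names, and the module-level list are assumptions made for illustration and do not mirror the kernel's data structures.

```python
from typing import List, Optional


class Personality:
    """Hypothetical record for one RAID personality."""

    def __init__(self, name: str, level: int):
        self.name = name
        self.level = level


# A plain list that is searched on lookup, instead of a fixed array
# indexed by an artificial personality number.
_personalities: List[Personality] = []


def register_personality(pers: Personality) -> None:
    _personalities.append(pers)


def unregister_personality(pers: Personality) -> None:
    _personalities.remove(pers)


def find_personality(level: int) -> Optional[Personality]:
    """Linear search by level; lookups are rare, so this is cheap enough."""
    for pers in _personalities:
        if pers.level == level:
            return pers
    return None


# One module may register more than one personality, as raid5 does
# for levels 4 and 5.
register_personality(Personality("raid4", 4))
register_personality(Personality("raid5", 5))
```

Because nothing outside the list knows which levels exist, a new personality can be added without touching the core lookup code.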
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
...because that seems to be the preferred practice these days.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
- replace open-coded hash chain with hlist macros
- Fix hash-table size at one page - it is already quite generous, so there
will never be a need to use multiple pages, so no need for __get_free_pages
No functional change.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Replace multiple kmalloc/memset pairs with kzalloc calls.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Substitute:
page_cache_get -> get_page
page_cache_release -> put_page
PAGE_CACHE_SHIFT -> PAGE_SHIFT
PAGE_CACHE_SIZE -> PAGE_SIZE
PAGE_CACHE_MASK -> PAGE_MASK
__free_page -> put_page
because we aren't using the page cache, we are just using pages.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With this patch it is possible to poll /proc/mdstat to detect arrays appearing
or disappearing, to detect failures, recovery starting, recovery completing,
and devices being added and removed.
It is similar to the poll-ability of /proc/mounts, though different in that:
We always report that the file is readable (because face it, it is, even if
only for EOF).
We report POLLPRI when there is a change so that select() can detect
it as an exceptional event. Not only are these exceptional events, but
that is the mechanism that the current 'mdadm' uses to watch for events
(It also polls after a timeout).
(We also report POLLERR like /proc/mounts).
Finally, we only reset the per-file event counter when the start of the file
is read, rather than when poll() returns an event. This is more robust as it
means that an fd will continue to report activity to poll/select until the
program clearly responds to that activity.
md_new_event takes an 'mddev' which isn't currently used, but it will be soon.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
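From userspace, the behaviour described above could be used along these lines. This is a minimal sketch in Python rather than a description of how mdadm is written; the function names, the default timeout, and the number of rounds are assumptions chosen for the example.

```python
import select


def wait_for_md_event(f, timeout: float) -> bool:
    """Wait until `f` reports an exceptional condition or `timeout` expires.

    The change is reported as POLLPRI, which select() surfaces in its
    third (exceptional) set, so only that set is watched here.
    """
    _, _, exceptional = select.select([], [], [f], timeout)
    return bool(exceptional)


def watch_mdstat(path: str = "/proc/mdstat", timeout: float = 60.0,
                 rounds: int = 3):
    """Yield (changed, contents) pairs, re-reading the file each round."""
    with open(path) as f:
        f.read()  # initial read from the start of the file
        for _ in range(rounds):
            changed = wait_for_md_event(f, timeout)
            # Reading from offset 0 is what acknowledges the event, so
            # seek back to the start before reading again.
            f.seek(0)
            yield changed, f.read()
```

The file is re-read after a timeout as well as after an event, which matches the note above that the watcher also polls after a timeout.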
Add in correct read-error handling for resync and read-only situations.
When read-only, we don't over-write, so we need to mark the failed drive in
the r10_bio so we don't re-try it. During resync, we always read all blocks,
so if there is a read error, we simply over-write it with the good block that
we found (assuming we found one).
Note that the recovery case still isn't handled in an interesting way. There
is nothing useful to do for the 2-copies case. If there are 3 or more copies,
then we could try reading from one of the non-missing copies, but this is a
bit complicated and very rarely would be used, so I'm leaving it for now.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Largely just a cross-port from raid1.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We are inadvertently setting the R1BIO_Uptodate bit on read errors when we
decide not to try correcting (because there are no other working devices).
This means that the read error is reported to the client as success.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When performing a user-requested 'check' or 'repair', we read all readable
devices, and compare the contents. We only write to blocks which had read
errors, or blocks with content that differs from the first good device found.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
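The selection rule described above (rewrite only blocks that failed to read or that differ from the first good copy) can be sketched as follows. This is an illustration in Python, not the raid1 code; the function name and the use of None to represent a read error are assumptions.

```python
from typing import List, Optional


def blocks_to_rewrite(copies: List[Optional[bytes]]) -> List[int]:
    """Return indices of devices whose block should be rewritten.

    copies[i] holds the data read from device i, or None if that read
    failed. The first readable copy is taken as the reference.
    """
    good = next((c for c in copies if c is not None), None)
    if good is None:
        return []  # nothing was readable, so there is nothing to write from
    return [i for i, c in enumerate(copies) if c is None or c != good]
```

For example, with copies `[b"A", None, b"B", b"A"]` devices 1 and 2 would be rewritten from device 0, and device 3 would be left alone.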
Also keep count of the number of errors found.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
There is this "FIXME" comment with a typo in it!! that has been annoying me for
days, so I just had to remove it.
conf->disks[i].rdev should only be accessed if
- we know we hold a reference or
- the mddev->reconfig_sem is down or
- we have a rcu_readlock
handle_stripe was referencing rdev in three places without any of these. For
the first two, get an rcu_readlock. For the last, the same access
(md_sync_acct call) is made a little later after the rdev has been claimed
under an rcu_readlock, if R5_Syncio is set. So just use that access...
However R5_Syncio isn't really needed as the 'syncing' variable contains the
same information. So use that instead.
Issues, comment, and fix are identical in raid5 and raid6.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Handling of read errors during resync is separate from handling of read errors
during normal IO in raid1. A previous patch added support for read errors
during normal IO. This one adds support for read errors during resync or
recovery.
The key differences are that we don't need to freeze the array, because the
normal handling of resync means that this part of the array will be idle
except for resync, and the read/overwrite/re-read is needed in a separate
piece of code.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We are dereferencing ->rdev without an rcu lock!
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
On a read-error we suspend the array, then synchronously read the block from
other devices until we find one where we can read it. Then we try writing the
good data back everywhere and make sure it works. If any write or subsequent
read fails, only then do we fail the device out of the array.
To be able to suspend the array, we need to also keep track of how many
requests are queued for handling by raid1d.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is a simple port of matching functionality across from raid5. If we get a
read error, we don't kick the drive straight away, but try to over-write with
good data first.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
raid6 currently does not check the P/Q syndromes when doing a resync, it just
calculates the correct value and writes it. Doing the check can reduce writes
(often to 0) for a resync, and it is needed to properly implement the
    echo check > sync_action
operation.
This patch implements the appropriate checks and tidies up some related code.
It also allows raid6 user-requested resync to bypass the intent bitmap.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
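The difference between "calculate and write" and "check, then write only on mismatch" can be sketched as below. This is a Python illustration under stated assumptions, not the raid6 code: it assumes the usual RAID-6 construction where P is the XOR of the data blocks and Q is a sum weighted by powers of 2 in GF(2^8) with polynomial 0x11d, and all function names are hypothetical.

```python
from typing import List, Tuple


def gf_mul2(x: int) -> int:
    """Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
    return x & 0xFF


def compute_pq(data: List[bytes]) -> Tuple[bytes, bytes]:
    """Compute P (plain XOR) and Q (weighted sum) for one stripe."""
    size = len(data[0])
    p = bytearray(size)
    q = bytearray(size)
    # Horner's rule, starting from the highest-numbered data block,
    # gives Q = d0 + 2*d1 + 4*d2 + ... in GF(2^8).
    for block in reversed(data):
        for i in range(size):
            p[i] ^= block[i]
            q[i] = gf_mul2(q[i]) ^ block[i]
    return bytes(p), bytes(q)


def syndromes_to_write(data: List[bytes], stored_p: bytes,
                       stored_q: bytes) -> List[str]:
    """Return which of "P" and "Q" differ from the computed values."""
    p, q = compute_pq(data)
    writes = []
    if p != stored_p:
        writes.append("P")
    if q != stored_q:
        writes.append("Q")
    return writes
```

On a stripe whose stored syndromes are already correct, the returned list is empty and no write is needed, which is how a checking resync can reduce writes to zero.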
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is important because bitmap_create uses
mddev->resync_max_sectors
and that doesn't have a valid value until after the array
has been initialised (with pers->run()).
[It doesn't make a difference for current personalities that
support bitmaps, but will make a difference for raid10]
This has the added advantage of meaning we can move the thread->timeout
manipulation inside the bitmap.c code instead of sprinkling identical code
throughout all personalities.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds