01 Nov, 2005
1 commit
-
Instead of having ->read_sectors and ->write_sectors, combine the two
into ->sectors[2] and similar for the other fields. This saves a branch
in several places in the io path, since we don't have to care what the
actual io direction is. On my x86-64 box, that's 200 bytes less text in
just the core (not counting the various drivers).
Signed-off-by: Jens Axboe
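A minimal userspace sketch of the idea, assuming a two-slot array indexed by direction (0 for reads, 1 for writes). The struct, enum and function names are hypothetical and are not taken from the kernel source:

```c
#include <assert.h>
#include <stddef.h>

/* Direction index: 0 for reads, 1 for writes (illustrative names). */
enum io_dir { IO_READ = 0, IO_WRITE = 1 };

/* Hypothetical stand-in for per-disk statistics. */
struct disk_stats_sketch {
    unsigned long sectors[2];   /* replaces read_sectors / write_sectors */
    unsigned long ios[2];       /* same layout for the other counters */
};

/* Account one request; the direction selects the slot, so no if/else. */
static void account_io(struct disk_stats_sketch *st, enum io_dir dir,
                       unsigned long nr_sectors)
{
    assert(st != NULL);
    st->sectors[dir] += nr_sectors;
    st->ios[dir]++;
}
```

Because the direction is used directly as an index, the accounting path needs no comparison on the direction, which is where the saved branch comes from.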
30 Oct, 2005
1 commit
-
This patch uses sg_set_buf/sg_init_one in some places where it was
duplicated.
Signed-off-by: David Hardeman
Cc: James Bottomley
Cc: Greg KH
Cc: "David S. Miller"
Cc: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Herbert Xu
28 Oct, 2005
1 commit
-
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
27 Oct, 2005
1 commit
-
There are still a couple of cases where md threads (the resync/recovery
thread) are not interruptible since the change to use kthreads. Every place
that tests "signal_pending" should also test kthread_should_stop,
as this patch does.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
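The pattern can be illustrated in userspace, assuming two plain flags as stand-ins for signal_pending(current) and kthread_should_stop(). Everything below (struct, function names, the simulated stop request) is hypothetical and only shows the shape of the check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for signal_pending(current) and kthread_should_stop(). */
struct thread_flags_sketch {
    bool signal_pending;
    bool should_stop;
};

/* The combined test: either condition must interrupt the work loop. */
static bool must_abort(const struct thread_flags_sketch *t)
{
    return t->signal_pending || t->should_stop;
}

/*
 * Process up to nr_chunks of work, polling the flags before each chunk.
 * stop_at simulates a stop request arriving once that many chunks are done.
 * Returns the number of chunks completed.
 */
static unsigned long resync_sketch(struct thread_flags_sketch *t,
                                   unsigned long nr_chunks,
                                   unsigned long stop_at)
{
    unsigned long done = 0;

    assert(t != NULL);
    while (done < nr_chunks) {
        if (done == stop_at)
            t->should_stop = true;  /* simulated stop request */
        if (must_abort(t))
            break;
        done++;
    }
    return done;
}
```

A loop that tested only signal_pending would run to completion in the simulated-stop case, which is the non-interruptible behaviour described above.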
20 Oct, 2005
1 commit
-
The main problem fixed is that in certain situations stopping md arrays may
take longer than you expect, or may require multiple attempts. This would
only happen when resync/recovery is happening.

This patch fixes three vaguely related bugs.
1/ The recent change to use kthreads got the setting of the
process name wrong. This fixes it.
2/ The recent change to use kthreads lost the ability for
md threads to be signalled with SIGKILL. This restores that.
3/ There is a long standing bug in that if:
- An array needs recovery (onto a hot-spare) and
- The recovery is being blocked because some other array being
recovered shares a physical device and
- The recovery thread is killed with SIGKILL
Then the recovery will appear to have completed with no IO being
done, which can cause data corruption.
This patch makes sure that incomplete recovery will be treated as
incomplete.

Note that any kernel affected by bug 2 will not suffer the problem of bug
3, as the signal can never be delivered. Thus the current 2.6.14-rc
kernels are not susceptible to data corruption. Note also that if arrays
are shut down (with "mdadm -S" or "raidstop") then the problem doesn't
occur. It only happens if a SIGKILL is independently delivered, as done by
'init' when shutting down.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Oct, 2005
1 commit
-
- added typedef unsigned int __nocast gfp_t;
- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
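A small sketch of how such a typedef can be written so that only sparse sees the annotation, assuming the checker defines __CHECKER__ and a regular compiler does not. The flag value and the helper function are illustrative, not the kernel's definitions:

```c
#include <assert.h>

/* Under sparse the annotation is an attribute; under gcc it expands to nothing. */
#ifndef __nocast
# ifdef __CHECKER__
#  define __nocast __attribute__((nocast))
# else
#  define __nocast
# endif
#endif

/* For gcc this is just a typedef of unsigned int. */
typedef unsigned int __nocast gfp_t;

/* Illustrative flag bit; the value is an assumption made for this sketch. */
#define SKETCH_GFP_WAIT 0x10u

/* The type in the prototype documents that gfp flags are expected here. */
static int may_sleep(gfp_t flags)
{
    return (flags & SKETCH_GFP_WAIT) != 0;
}
```

Since the macro is empty for a normal build, the generated code is identical to using unsigned int directly.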
28 Sep, 2005
2 commits
-
When creating a multipath device, if the queue_if_no_path parameter is
specified it gets ignored.

While the queue_if_no_path variable is correctly set to 1, the
saved_queue_if_no_path gets set to 0. When the device is subsequently made
live (resumed), the saved value (0) always overwrites the live value (1) so
the option *always* gets turned off.

The fix adds a parameter to the queue_if_no_path() function to indicate
whether the previous value should be preserved or not - if not, as when the
device is being set up, the saved value is set to the new value (1).
Signed-Off-By: Alasdair G Kergon
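The live/saved interaction can be sketched in userspace as follows. The struct, the function names and the save_old_value parameter name are assumptions made for illustration; only the two field names come from the description above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in holding just the two values discussed above. */
struct multipath_sketch {
    unsigned queue_if_no_path;        /* live value */
    unsigned saved_queue_if_no_path;  /* value restored on resume */
};

/*
 * Set the live value. If save_old_value is non-zero the previous live
 * value is preserved for a later resume; otherwise (device set-up) the
 * saved value is set to the new value as well.
 */
static void set_queue_if_no_path(struct multipath_sketch *m, unsigned value,
                                 unsigned save_old_value)
{
    assert(m != NULL);
    if (save_old_value)
        m->saved_queue_if_no_path = m->queue_if_no_path;
    else
        m->saved_queue_if_no_path = value;
    m->queue_if_no_path = value;
}

/* Resume: the saved value overwrites the live value. */
static void resume_sketch(struct multipath_sketch *m)
{
    assert(m != NULL);
    m->queue_if_no_path = m->saved_queue_if_no_path;
}
```

With save_old_value set to 0 at set-up time, a later resume no longer turns the option off.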
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
If anything is waiting on a device's table when the device is removed, we
must first wake it up so it will release its reference. Otherwise the
table's reference count will not drop to zero and the table will not get
removed.
Signed-Off-By: Alasdair G Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Sep, 2005
1 commit
-
This patch fixes a signedness bug with RAID6 for Altivec, and makes the
Altivec code testable in userspace.
Signed-off-by: H. Peter Anvin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
11 Sep, 2005
2 commits
-
This patch contains the most trivial from Rusty's trivial patches:
- spelling fixes
- remove duplicate includes
Signed-off-by: Adrian Bunk
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch does a full cleanup of 'NULL checks before vfree', and a partial
cleanup of calls to kfree for all of drivers/ - the kfree bit is partial in
that I only did the files that also had vfree calls in them. The patch
also gets rid of some redundant (void *) casts of pointers being passed to
[vk]free, and some tiny whitespace corrections also crept in.
Signed-off-by: Jesper Juhl
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
10 Sep, 2005
29 commits
-
This shouldn't be a BUG. We should cope.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
If you try to assemble an array with too many missing devices, raid10 will now
reject the attempt, instead of allowing it.

Also check when hot-adding a drive and refuse the hot-add if the array is
beyond hope.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There was another case where sb_size wasn't being set, so instead do the
sensible thing and set it when filling in the content of a superblock. That
ensures that whenever we write a superblock, the sb_size MUST be set.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
…-existing superblock.
There are two ways to add devices to an md/raid array.
It can have a superblock written to it, and then be given to the md driver,
which will read the superblock (the new way)
or
md can be told (through SET_ARRAY_INFO) the shape of the array, and
then told about individual drives, and md will create the required
superblock (the old way).

The newly introduced sb_size was only set for drives being added the
new way, not the old way. Oops :-(
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Just like failed drives have (F), so spare drives now have (S).
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Leave it unchanged if the original (0.90) is used, in case it might be a
compatibility problem.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Doh. I want the physical hard-sector-size, not the current block size...
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
On reflection, a better default location for hot-adding bitmaps with version-1
superblocks is immediately after the superblock. There might not be much room
there, but there is usually at least 3k, and that is a good start.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The bitmap code used to have two daemons, so there is some 'common' start/stop
code. But now there is only one, so the common code is just noise.

This patch tidies this up somewhat.
Signed-off-by: Neil Brown
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
mddev->bitmap gets cleared before the writeback daemon is stopped. So the
write_back daemon needs to be careful not to dereference the 'bitmap' if it is
NULL.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Switch MD to use the kthread infrastructure, to simplify the code and get rid
of tasklist_lock abuse in md_unregister_thread.

Also don't flush signals in md_thread, as the called thread will always do
that.
Signed-off-by: Christoph Hellwig
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This is a direct port of the raid5 patch.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Most awkward part of this is delaying write requests until bitmap updates have
been flushed.

To achieve this, we have a sequence number (seq_flush) which is incremented
each time the raid5 is unplugged.

If the raid thread notices that this has changed, it flushes bitmap changes,
and assigns the value of seq_flush to seq_write.

When a write request arrives, it is given the number from seq_write, and that
write request may not complete until seq_flush is larger than the saved seq
number.

We have a new queue for storing stripes which are waiting for a bitmap flush
and an extra flag for stripes to record if the write was 'degraded' and so
should not clear the bit in the bitmap.
Signed-off-by: Neil Brown
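The sequence-number rule can be sketched in userspace as below. This follows the wording of the description only; the struct and function names are hypothetical, counter wraparound and the queue of delayed stripes are left out, and the comparisons in the actual driver may differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters named in the description. */
struct seq_sketch {
    unsigned int seq_flush;  /* incremented on every unplug */
    unsigned int seq_write;  /* value of seq_flush at the last bitmap flush */
};

/* Unplug: note that a flush is wanted. */
static void unplug_sketch(struct seq_sketch *c)
{
    assert(c != NULL);
    c->seq_flush++;
}

/* Raid thread: flush bitmap changes if seq_flush moved; true if it flushed. */
static bool raid_thread_sketch(struct seq_sketch *c)
{
    if (c->seq_write == c->seq_flush)
        return false;
    /* a real driver would write out pending bitmap updates here */
    c->seq_write = c->seq_flush;
    return true;
}

/* A new write request is stamped with the current seq_write. */
static unsigned int stamp_write_sketch(const struct seq_sketch *c)
{
    return c->seq_write;
}

/* The write may complete only once seq_flush has passed its stamp. */
static bool write_may_complete_sketch(const struct seq_sketch *c,
                                      unsigned int saved)
{
    return c->seq_flush > saved;
}
```

A write stamped before any unplug is held back, and becomes eligible once an unplug has advanced seq_flush past the stamp.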
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
version-1 superblocks are not (normally) 4K long, and can be of variable size.
Writing the full 4K can cause corruption (but only in non-default
configurations).

With this patch the super-block-flavour can choose a size to read, and set a
size to write based on what it finds.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
read_sb_page() assumed that if sync_page_io fails, the device would be marked
faulty. However it isn't. So in the face of error, read_sb_page would loop
forever.

Redo the logic so that this cannot happen.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
As this is used to flag an internal bitmap.
Also, introduce symbolic names for feature bits.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
It is possible (and occasionally useful) to have a raid1 without persistent
superblocks. The code in add_new_disk for adding a device to such an array
always tries to read a superblock.

This will obviously fail.
So do the appropriate test and call md_import_device with
appropriate args.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
When hot-adding a bitmap, bitmap_daemon_work could get called while the bitmap
is being created, so don't set mddev->bitmap until the bitmap is ready.

This requires freeing the bitmap inside bitmap_create if creation failed
part-way through.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The 'lastrun' time wasn't being initialised, so it could be half a
jiffie-cycle before it seemed to be time to do work again.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
A state of 0 means 'not quiesced'
A state of 1 means 'is quiesced'

The original code got this wrong.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
linear currently uses division by the size of the smallest component device
to find which device a request goes to. If that smallest device is larger
than 2 terabytes, then the division will not work on some systems.

So we introduce a pre-shift, and take care not to make the hash table too
large, much like the code in raid0.

Also get rid of conf->nr_zones, which is not needed.
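A rough userspace sketch of the pre-shift idea, assuming the goal is to keep the divisor within 32 bits by shifting both the divisor and the block number right by the same amount. The names are hypothetical, and the sizing of the hash table is not modelled:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup parameters: a 32-bit divisor plus a shift count. */
struct linear_hash_sketch {
    uint32_t spacing;   /* smallest device size after the pre-shift */
    unsigned preshift;  /* bits dropped from both operands */
};

/* Shift the smallest device size right until it fits in 32 bits. */
static struct linear_hash_sketch make_hash_sketch(uint64_t smallest_size)
{
    struct linear_hash_sketch h = {0, 0};

    assert(smallest_size != 0);
    while (smallest_size > UINT32_MAX) {
        smallest_size >>= 1;
        h.preshift++;
    }
    h.spacing = (uint32_t)smallest_size;
    return h;
}

/* Map a block number to a hash slot using only a 32-bit divisor. */
static uint64_t hash_slot_sketch(const struct linear_hash_sketch *h,
                                 uint64_t block)
{
    return (block >> h->preshift) / h->spacing;
}
```

The pre-shift discards low-order bits, so a lookup table used with it has to be built with the same shifted spacing.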
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
If a device is flagged 'WriteMostly' and the array has a bitmap, and the
bitmap superblock indicates that write_behind is allowed, then write_behind is
enabled for WriteMostly devices.

Write requests will be acknowledged as complete to the caller (via b_end_io)
when all non-WriteMostly devices have completed the write, but will not be
cleared from the bitmap until all devices complete.

This requires memory allocation to make a local copy of the data being
written. If there is insufficient memory, then we fall back on normal write
semantics.
Signed-Off-By: Paul Clements
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This allows a device in a raid1 to be marked as "write mostly". Read requests
will only be sent if there is no other option.
Signed-off-by: Neil Brown
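The read policy described here can be sketched as a selection loop, assuming each device carries an in-sync flag and a write-mostly flag. The struct and function are hypothetical and ignore the head-position balancing a real raid1 would also do:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state. */
struct rdev_sketch {
    int in_sync;       /* device holds valid data */
    int write_mostly;  /* avoid reading from it when possible */
};

/*
 * Return the index of the device to read from, or -1 if none is usable.
 * A write-mostly device is chosen only when no other in-sync device exists.
 */
static int choose_read_dev(const struct rdev_sketch *devs, int n)
{
    int fallback = -1;
    int i;

    assert(devs != NULL);
    for (i = 0; i < n; i++) {
        if (!devs[i].in_sync)
            continue;
        if (!devs[i].write_mostly)
            return i;
        if (fallback < 0)
            fallback = i;
    }
    return fallback;
}
```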
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Both file-bitmaps and superblock bitmaps are supported.
If you add a bitmap file on the array device, you lose.
This introduces a 'default_bitmap_offset' field in mddev, as the ioctl used
for adding a superblock bitmap doesn't have room for giving an offset. Later,
this value will be settable via sysfs.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
When we find a 'stale' bitmap, possibly because it is new, we should not just
assume every bit needs to be set, but rather base the setting of bits on the
current state of the array (degraded and recovery_cp).
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
... otherwise we lose a reference and can never free the file.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Fix another bug in dm-raid1.c that the dirty region may stay in or be moved
to clean list and freed while in use.

It happens as follows:

CPU0                                  CPU1
------------------------------------------------------------------------------
rh_dec()
  if (atomic_dec_and_test(pending))
                                      rh_inc()
                                        if the region is clean
                                          mark the region dirty
                                          and remove from clean list
    mark the region clean
    and move to clean list
                                        atomic_inc(pending)

At this stage, the region is in clean list and will be mistakenly reclaimed
by rh_update_states() later.
Signed-off-by: Jun'ichi Nomura
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
md does not yet support BIO_RW_BARRIER, so be honest about it and fail
(-EOPNOTSUPP) any such requests.
Signed-off-by: Neil Brown
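In outline, the check is an early return at the top of the request path. The sketch below is a userspace illustration only: the flag bit value and the function name are assumptions, and only the error code comes from the message above:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bit standing in for a barrier request marker. */
#define SKETCH_RW_BARRIER (1u << 2)

/* Reject barrier requests up front; accept everything else. */
static int make_request_sketch(unsigned int rw_flags)
{
    if (rw_flags & SKETCH_RW_BARRIER)
        return -EOPNOTSUPP;
    /* normal request handling would continue here */
    return 0;
}
```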
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
'this_sector' is a virtual (array) address while 'head_position' is a physical
(device) address, so subtraction doesn't make any sense. devs[slot].addr
should be used instead of this_sector.

However, this patch doesn't make much practical difference to the read
balancing due to the effects of later code.
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
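The point about address spaces can be sketched as follows, assuming each copy of the data records the physical address it occupies on its device and each device records its last head position. The structs and the selection function are hypothetical and cover only the distance comparison:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state: where the head was last left. */
struct mirror_sketch {
    uint64_t head_position;  /* physical (device) address */
};

/* Hypothetical per-copy state: which device, and where on that device. */
struct copy_sketch {
    int disk;
    uint64_t addr;           /* physical (device) address of this copy */
};

static uint64_t distance_sketch(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}

/* Pick the copy whose device head is closest, comparing physical addresses. */
static int pick_copy_sketch(const struct copy_sketch *copies, int n,
                            const struct mirror_sketch *mirrors)
{
    int best = -1;
    uint64_t best_dist = UINT64_MAX;
    int slot;

    assert(copies != NULL && mirrors != NULL);
    for (slot = 0; slot < n; slot++) {
        uint64_t d = distance_sketch(copies[slot].addr,
                                     mirrors[copies[slot].disk].head_position);
        if (d < best_dist) {
            best_dist = d;
            best = slot;
        }
    }
    return best;
}
```

Both operands of the distance are device addresses, which is what makes the subtraction meaningful.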