Eric Lee / smarc-fsl-linux-kernel

07 Feb, 2008

3 commits

d089c6af1 md: change ITERATE_RDEV to rdev_for_each

As this is more in line with common practice in the kernel. Also swap the
args around to be more like list_for_each.
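
The argument order described above can be illustrated with a small userspace sketch. The list node, the struct layouts and the helper sum_desc_nr() are hypothetical stand-ins, not the md definitions; only the macro name and the cursor-first, container-last ordering come from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's list and rdev types. */
struct list_node { struct list_node *next; };

struct rdev {
    int desc_nr;
    struct list_node same_set;      /* links the devices of one array */
};

struct mddev { struct list_node disks; };   /* circular list head */

#define node_to_rdev(p) \
    ((struct rdev *)((char *)(p) - offsetof(struct rdev, same_set)))

/* Cursor first, container last, in the style of list_for_each(pos, head). */
#define rdev_for_each(rdev, tmp, mddev)                               \
    for ((tmp) = (mddev)->disks.next;                                 \
         (tmp) != &(mddev)->disks &&                                  \
         ((rdev) = node_to_rdev(tmp), (tmp) = (tmp)->next, 1);        \
         /* no step: tmp was advanced above */)

/* Sum the descriptor numbers of all member devices. */
static int sum_desc_nr(struct mddev *mddev)
{
    struct rdev *rdev;
    struct list_node *tmp;
    int sum = 0;

    rdev_for_each(rdev, tmp, mddev)
        sum += rdev->desc_nr;
    return sum;
}
```

Because tmp is advanced before the loop body runs, the body may unlink the current entry without breaking the walk.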

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2008-02-07 02:41:19 +0800
c62072777 md: allow a maximum extent to be set for resyncing

This allows userspace to control resync/reshape progress and synchronise it
with other activities, such as shared access in a SAN, or backing up critical
sections during a tricky reshape.

Writing a number of sectors (which must be a multiple of the chunk size if
such is meaningful) causes a resync to pause when it gets to that point.
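
A minimal sketch of the two checks this implies, assuming a chunk size of 0 means "chunks are not meaningful". The function names and the sector_t typedef are illustrative, not taken from the patch.

```c
#include <stdbool.h>

typedef unsigned long long sector_t;    /* assumed width, for the sketch */

/* A limit is acceptable if chunks are not meaningful (chunk_sectors == 0)
 * or if it falls on a chunk boundary. */
static bool resync_max_valid(sector_t max, unsigned int chunk_sectors)
{
    return chunk_sectors == 0 || max % chunk_sectors == 0;
}

/* The resync loop would pause once its position reaches the limit. */
static bool resync_must_pause(sector_t pos, sector_t max)
{
    return pos >= max;
}
```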

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2008-02-07 02:41:18 +0800
b47490c9b md: Update md bitmap during resync.

Currently an md array with a write-intent bitmap does not update that bitmap
to reflect successful partial resync. Rather, the entire bitmap is updated
when the resync completes.

This is because there is no guarantee that resync requests will complete in
order, and tracking each request individually is unnecessarily burdensome.

However there is value in regularly updating the bitmap, so add code to
periodically pause while all pending sync requests complete, then update the
bitmap. Doing this only every few seconds (the same as the bitmap update
time) does not noticeably affect resync performance.
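
The idea can be sketched as a rate-limited checkpoint. This is an illustration under stated assumptions, not the code of bitmap_cond_end_sync: the struct, its fields and the function name are hypothetical, time is passed in as plain seconds, and draining in-flight requests is simulated by zeroing a counter.

```c
#include <stdbool.h>

/* Hypothetical state; none of these field names come from the md source. */
struct sync_state {
    long last_update;                /* time of the last bitmap update (s) */
    long interval;                   /* minimum seconds between updates */
    int pending;                     /* resync requests still in flight */
    unsigned long long synced_upto;  /* bitmap reflects resync up to here */
};

/* Called from the resync loop before issuing requests at 'sector'.
 * Returns true if the bitmap checkpoint was advanced. */
static bool cond_end_sync(struct sync_state *s, long now,
                          unsigned long long sector)
{
    if (now - s->last_update < s->interval)
        return false;                /* too soon, keep resyncing */

    s->pending = 0;                  /* stand-in for waiting until drained */
    s->synced_upto = sector;         /* everything below 'sector' is in sync */
    s->last_update = now;
    return true;
}
```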

[snitzer@gmail.com: export bitmap_cond_end_sync]
Signed-off-by: Neil Brown
Cc: "Mike Snitzer"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2008-02-07 02:41:18 +0800

09 Nov, 2007

1 commit

2ad8b1ef1 Add UNPLUG traces to all appropriate places

Added blk_unplug interface, allowing all invocations of unplugs to result
in a generated blktrace UNPLUG.

Signed-off-by: Alan D. Brunelle
Signed-off-by: Jens Axboe

Alan D. Brunelle
2007-11-09 20:41:32 +0800

16 Oct, 2007

1 commit

fd5d80626 block: convert blkdev_issue_flush() to use empty barriers

Then we can get rid of ->issue_flush_fn() and all the driver private
implementations of that.

Signed-off-by: Jens Axboe

Jens Axboe
2007-10-16 17:05:02 +0800

10 Oct, 2007

1 commit

6712ecf8f Drop 'size' argument from bio_endio and bi_end_io

As bi_end_io is only called once when the request is complete,
the 'size' argument is now redundant. Remove it.

Now there is no need for bio_endio to subtract the size completed
from bi_size. So don't do that either.

While we are at it, change bi_end_io to return void.
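
The resulting callback shape can be sketched with a mock struct bio. Only the fields needed here are modelled, and count_completion() is a hypothetical callback; the point is that the callback takes no size, returns void, and that bio_endio leaves bi_size alone.

```c
/* Mock of the block layer types, reduced to what the sketch needs. */
struct bio;
typedef void (bio_end_io_t)(struct bio *, int);

struct bio {
    unsigned int bi_size;       /* no longer decremented on completion */
    bio_end_io_t *bi_end_io;    /* called exactly once, returns void */
    void *bi_private;
};

/* Complete the bio: no 'size' argument, no partial completion. */
static void bio_endio(struct bio *bio, int error)
{
    if (bio->bi_end_io)
        bio->bi_end_io(bio, error);
}

/* Hypothetical callback that counts how often it is invoked. */
static void count_completion(struct bio *bio, int error)
{
    int *calls = bio->bi_private;

    (void)error;
    (*calls)++;
}
```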

Signed-off-by: Neil Brown
Signed-off-by: Jens Axboe

NeilBrown
2007-10-10 15:25:57 +0800

01 Aug, 2007

2 commits

f6f953aa9 md: handle writes to broken raid10 arrays gracefully

When writing to a broken array, raid10 currently happily emits empty bio
lists. IOW, the master bio will never be completed, sending writers to
UNINTERRUPTIBLE_SLEEP forever.
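
The failure mode follows from completion being driven by the last sub-write: with no sub-writes, nothing ever completes the master. A hedged sketch of the guard, using hypothetical types and names rather than the raid10 structures:

```c
#include <errno.h>
#include <stdbool.h>

struct master_bio { bool completed; int error; };

struct r10_write {
    struct master_bio *master;
    int remaining;              /* sub-writes still outstanding */
};

static void complete_master(struct r10_write *w, int error)
{
    w->master->error = error;
    w->master->completed = true;
}

/* Returns the number of sub-writes issued. With no working device the
 * master is failed right away instead of waiting for completions that
 * can never arrive. */
static int submit_write(struct r10_write *w, int working_devices)
{
    if (working_devices == 0) {
        complete_master(w, -EIO);
        return 0;
    }
    w->remaining = working_devices;
    return working_devices;
}

/* Called once per finished sub-write; the last one completes the master. */
static void sub_write_done(struct r10_write *w)
{
    if (--w->remaining == 0)
        complete_master(w, 0);
}
```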

Signed-off-by: Arne Redlich
Acked-by: Neil Brown
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arne Redlich
2007-08-01 06:39:38 +0800
14e713446 md: raid10: fix use-after-free of bio

In case of read errors raid10d tries to print a nice error message,
unfortunately using data from an already put bio.

Signed-off-by: Maik Hampel
Acked-By: NeilBrown
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Maik Hampel
2007-08-01 06:39:38 +0800

24 Jul, 2007

1 commit

165125e1e [BLOCK] Get rid of request_queue_t typedef

Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweep of
the kernel and get rid of this typedef and replace its uses with
the proper type.

Signed-off-by: Jens Axboe

Jens Axboe
2007-07-24 15:28:11 +0800

18 Jul, 2007

1 commit

4ad136637 md: change bitmap_unplug and others to void functions

bitmap_unplug only ever returns 0, so it may as well be void. Two callers try
to print a message if it returns non-zero, but that message is already printed
by bitmap_file_kick.

write_page returns an error which is not consistently checked. It always
causes BITMAP_WRITE_ERROR to be set on an error, and that can more
conveniently be checked.

When the return of write_page is checked, an error causes bitmap_file_kick to
be called - so move that call into write_page - and protect against recursive
calls into bitmap_file_kick.

bitmap_update_sb returns an error that is never checked.

So make these 'void' and be consistent about checking the bit.
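
A sketch of the pattern, assuming a single error bit and a mock struct bitmap. The bit value, the io_ok parameter (standing in for the real I/O result) and the way recursion is prevented are illustrative choices, not necessarily what the patch does.

```c
#include <stdbool.h>

enum { BITMAP_WRITE_ERROR = 1u << 0 };  /* bit value is an assumption */

struct bitmap {
    unsigned int flags;
    int kicks;                  /* how often bitmap_file_kick() ran */
};

static void bitmap_file_kick(struct bitmap *b)
{
    b->kicks++;
}

/* Returns nothing: a failure is recorded in the flag, and the kick is
 * issued from here rather than by every caller. */
static void write_page(struct bitmap *b, bool io_ok)
{
    if (io_ok)
        return;
    if (b->flags & BITMAP_WRITE_ERROR)
        return;                 /* already failed: do not kick again */
    b->flags |= BITMAP_WRITE_ERROR;
    bitmap_file_kick(b);
}

/* Callers check the bit instead of a return value. */
static bool bitmap_failed(const struct bitmap *b)
{
    return (b->flags & BITMAP_WRITE_ERROR) != 0;
}
```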

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2007-07-18 01:23:15 +0800

17 Jun, 2007

1 commit

af03b8e4e md: fix two raid10 bugs

1/ When resyncing a degraded raid10 which has more than 2 copies of each block,
garbage can get synced on top of good data.

2/ We round the wrong way in part of the device size calculation, which
can cause confusion.

Signed-off-by: Neil Brown
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2007-06-17 04:16:15 +0800

02 Mar, 2007

1 commit

64a742bc6 [PATCH] md: fix raid10 recovery problem.

There are two errors that can lead to recovery problems with raid10
when used in 'far' mode (not the default).

Due to a '>' instead of '>=' the wrong block is located which would result in
garbage being written to some random location, quite possibly outside the
range of the device, causing the newly reconstructed device to fail.

The device size calculation had some rounding errors (it didn't round when it
should) and so recovery would go a few blocks too far which would again cause
a write to a random block address and probably a device error.

The code for working with device sizes was fairly confused and spread out, so
this has been tidied up a bit.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2007-03-02 06:53:36 +0800

12 Jan, 2007

1 commit

e3881a681 [PATCH] md: pass down BIO_RW_SYNC in raid{1,10}

md raidX make_request functions strip off the BIO_RW_SYNC flag, thus
introducing additional latency.

Fixing this in raid1 and raid10 seems to be straightforward enough.

For our particular usage case in DRBD, passing this flag improved some
initialization time from ~5 minutes to ~5 seconds.

Acked-by: NeilBrown
Signed-off-by: Lars Ellenberg
Acked-by: Jens Axboe
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lars Ellenberg
2007-01-12 10:18:21 +0800

14 Dec, 2006

1 commit

802ba064c [PATCH] md: Don't assume that READ==0 and WRITE==1 - use the names explicitly

Thanks Jens for alerting me to this.

Cc: Jens Axboe
Cc:
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-12-14 01:05:48 +0800

29 Oct, 2006

1 commit

969b755aa [PATCH] md: fix printk format warnings, seen on powerpc64:

drivers/md/raid1.c:1479: warning: long long unsigned int format, long unsigned int arg (arg 4)
drivers/md/raid10.c:1475: warning: long long unsigned int format, long unsigned int arg (arg 4)
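
Warnings of this kind are usually silenced by casting the argument to the type the format expects. A userspace sketch, assuming the value is an unsigned long on the affected configuration; the typedef and format_sector() are illustrative, and snprintf stands in for printk.

```c
#include <stdio.h>

typedef unsigned long sector_t;     /* assumed; the width varies by config */

/* "%llu" needs an unsigned long long, so cast explicitly rather than
 * relying on sector_t happening to have that type. */
static int format_sector(char *buf, size_t len, sector_t sect)
{
    return snprintf(buf, len, "sector %llu", (unsigned long long)sect);
}
```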

Signed-off-by: Randy Dunlap
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2006-10-29 02:30:52 +0800

22 Oct, 2006

1 commit

2e333e898 [PATCH] md: fix calculation of ->degraded for multipath and raid10

Two less-used md personalities have bugs in the calculation of ->degraded (the
extent to which the array is degraded).

Signed-off-by: Neil Brown
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-22 04:35:05 +0800

03 Oct, 2006

5 commits

0d1292282 [PATCH] md: define ->congested_fn for raid1, raid10, and multipath

raid1, raid10 and multipath don't report their 'congested' status through
bdi_*_congested, but should.

This patch adds the appropriate functions which just check the 'congested'
status of all active members (with appropriate locking).

raid1 read_balance should be modified to prefer devices where
bdi_read_congested returns false. Then we could use the '&' branch rather
than the '|' branch. However that would need some benchmarking first
to make sure it is actually a good idea.
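
The '|' behaviour amounts to "congested if any active member is congested". A sketch with a hypothetical member struct; locking and the per-device query are left out.

```c
#include <stdbool.h>
#include <stddef.h>

struct member {
    bool in_sync;       /* active member of the array */
    bool congested;     /* what the device's own query would report */
};

/* OR together the status of all active members (the '|' variant).
 * An '&' variant would instead report congestion only when every
 * active member is congested. */
static bool array_congested(const struct member *m, size_t n)
{
    bool ret = false;

    for (size_t i = 0; i < n && !ret; i++)
        if (m[i].in_sync)
            ret = m[i].congested;
    return ret;
}
```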

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-03 23:04:18 +0800
c04be0aa8 [PATCH] md: Improve locking around error handling

The error handling routines don't use proper locking, and so two concurrent
errors could trigger a problem.

So:
 - use test-and-set and test-and-clear to synchronise
   the In_sync bits with the ->degraded count
 - use the spinlock to protect updates to the
   degraded count (could use an atomic_t but that
   would be a bigger change in code, and isn't
   really justified)
 - remove un-necessary locking in raid5
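
The first point can be sketched in userspace with C11 atomics standing in for the kernel's test_and_clear_bit. The bit value and struct layouts are assumptions, and the spinlock around the counter is only indicated by a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define IN_SYNC 1u                  /* bit value is an assumption */

struct rdev  { atomic_uint flags; };
struct mddev { int degraded; };     /* guarded by a spinlock in real code */

/* Returns true only for the caller that actually cleared In_sync, so
 * two concurrent errors on one device bump 'degraded' exactly once. */
static bool mark_faulty(struct mddev *mddev, struct rdev *rdev)
{
    unsigned int old = atomic_fetch_and(&rdev->flags, ~IN_SYNC);

    if (!(old & IN_SYNC))
        return false;               /* someone else already cleared it */
    mddev->degraded++;              /* would be done under the spinlock */
    return true;
}
```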

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-03 23:04:18 +0800
76186dd8b [PATCH] md: remove 'working_disks' from raid10 state

It isn't needed as mddev->degraded contains equivalent info.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-03 23:04:17 +0800
850b2b420 [PATCH] md: replace magic numbers in sb_dirty with well defined bit flags

Instead of magic numbers (0,1,2,3) in sb_dirty, we now have
some flags:
 MD_CHANGE_DEVS
    Some device state has changed requiring superblock update
    on all devices.
 MD_CHANGE_CLEAN
    The array has transitioned from 'clean' to 'dirty' or back,
    requiring a superblock update on active devices, but possibly
    not on spares.
 MD_CHANGE_PENDING
    A superblock update is underway.

We wait for an update to complete by waiting for all flags to be clear. A
flag can be set at any time, even during an update, without risk that the
change will be lost.
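
Why a flag set during an update is not lost can be sketched as follows. The flag names are from the commit message; the bit positions, the struct and the three helpers are assumptions made for the illustration, and real code would use atomic bit operations.

```c
#include <stdbool.h>

enum {                              /* bit positions are assumptions */
    MD_CHANGE_DEVS    = 1u << 0,
    MD_CHANGE_CLEAN   = 1u << 1,
    MD_CHANGE_PENDING = 1u << 2,
};

struct mddev { unsigned int flags; };

/* Snapshot and clear the requests this update will service. A request
 * bit set after this point stays set and triggers another update. */
static unsigned int begin_sb_update(struct mddev *m)
{
    unsigned int todo = m->flags & (MD_CHANGE_DEVS | MD_CHANGE_CLEAN);

    m->flags &= ~todo;
    m->flags |= MD_CHANGE_PENDING;
    return todo;
}

static void end_sb_update(struct mddev *m)
{
    m->flags &= ~MD_CHANGE_PENDING;
}

/* Waiters are done only when every flag is clear. */
static bool sb_update_complete(const struct mddev *m)
{
    return m->flags == 0;
}
```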

Stop exporting md_update_sb - isn't needed.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-03 23:04:17 +0800
6814d5368 [PATCH] md: factor out part of raid10d into a separate function.

raid10d has too many nested blocks, so take the fix_read_error functionality
out into a separate function.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-10-03 23:04:17 +0800

11 Jul, 2006

1 commit

d69504325 [PATCH] md: include sector number in messages about corrected read errors

This is generally useful, but particularly helps see if it is the same sector
that always needs correcting, or different ones.

[akpm@osdl.org: fix printk warnings]
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-07-11 04:24:17 +0800

27 Jun, 2006

4 commits

883883283 [PATCH] md: Calculate correct array size for raid10 in new offset mode

The size calculation made assumptions which the new offset mode didn't
follow. This gets the size right in all cases.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-06-27 00:58:39 +0800
c93983bf5 [PATCH] md: support stripe/offset mode in raid10

The "industry standard" DDF format allows for a stripe/offset layout where
data is duplicated on different stripes. e.g.

A B C D
D A B C
E F G H
H E F G

(columns are drives, rows are stripes, LETTERS are chunks of data).

This is similar to raid10's 'far' mode, but not quite the same. So enhance
'far' mode with a 'far/offset' option which follows the layout of DDF's
stripe/offset.
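
The layout in the diagram above can be reproduced with a little arithmetic. This sketch assumes two copies and a shift of one drive per copy, which is what the diagram shows; the function names are hypothetical and this is not the raid10 mapping code.

```c
/* Index of the chunk stored at (row, drive): 0 = A, 1 = B, ... */
static int offset_chunk(int row, int drive, int ndrives)
{
    int stripe = row / 2;       /* each data stripe occupies two rows */
    int copy   = row % 2;       /* 0 = original, 1 = shifted copy */

    return stripe * ndrives + (drive - copy + ndrives) % ndrives;
}

/* Letter used for that chunk in the diagram. */
static char offset_chunk_name(int row, int drive, int ndrives)
{
    return (char)('A' + offset_chunk(row, drive, ndrives));
}
```

For four drives, row 1 yields D A B C and row 3 yields H E F G, matching the second and fourth rows of the diagram.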

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-06-27 00:58:37 +0800
5fd6c1dce [PATCH] md: allow checkpoint of recovery with version-1 superblock

For a while we have had checkpointing of resync. The version-1 superblock
allows recovery to be checkpointed as well, and this patch implements that.

Due to early carelessness we need to add a feature flag to signal that the
recovery_offset field is in use, otherwise older kernels would assume that a
partially recovered array is in fact fully recovered.
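
The role of the feature flag can be sketched like this. The struct, the flag name and its value are hypothetical; the point is only that the field is trusted when the flag says it is in use, and that a reader unaware of the flag would take the other branch.

```c
/* Hypothetical flag and superblock layout, for illustration only. */
#define FEATURE_RECOVERY_OFFSET 2u

struct sb {
    unsigned int feature_map;
    unsigned long long recovery_offset;     /* sectors already recovered */
};

/* How far the device counts as recovered. Without the flag the field
 * is ignored and the device is treated as fully recovered. */
static unsigned long long recovered_upto(const struct sb *sb,
                                         unsigned long long dev_size)
{
    if (sb->feature_map & FEATURE_RECOVERY_OFFSET)
        return sb->recovery_offset;
    return dev_size;
}
```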

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-06-27 00:58:37 +0800
8932c2e0d [PATCH] md: remove arbitrary limit on chunk size

The largest chunk size the code can support without substantial surgery is
2^30 bytes, so make that the limit instead of an arbitrary 4Meg. Some day,
the 'chunksize' should change to a sector-shift instead of a byte-count. Then
no limit would be needed.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-06-27 00:58:36 +0800

02 May, 2006

2 commits

e0a33270e [PATCH] md: Fixed refcounting/locking when attempting read error correction in raid10

We need to hold a reference to rdevs while reading and writing to attempt to
correct read errors. This reference must be taken under an rcu lock.

Signed-off-by: Neil Brown
Cc: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-05-02 09:17:42 +0800
df30d0f4c [PATCH] md: Avoid oops when attempting to fix read errors on raid10

We should add to the counter for the rdev *after* checking if the rdev is
NULL!!!

Signed-off-by: Neil Brown
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-05-02 09:17:42 +0800

02 Apr, 2006

1 commit

b63854838 BUG_ON() Conversion in md/raid10.c

This changes if() BUG(); constructs to BUG_ON(), which is
cleaner and can be better optimized away.

Signed-off-by: Eric Sesterhenn
Signed-off-by: Adrian Bunk

Eric Sesterhenn
2006-04-02 19:34:29 +0800

04 Feb, 2006

1 commit

29fc7e3e7 [PATCH] md: Assorted little md fixes

- version-1 superblock
  + The default_bitmap_offset is in sectors, not bytes.
  + The 'size' field in the superblock is in sectors, not KB.
- raid0_run should return a negative number on error, not '1'.
- raid10_read_balance should not return a valid 'disk' number if
  ->rdev turned out to be NULL.
- kmem_cache_destroy doesn't like being passed a NULL.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-02-04 00:32:00 +0800

15 Jan, 2006

1 commit

858119e15 [PATCH] Uninline a bunch of other functions

Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
Acked-by: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2006-01-15 10:27:06 +0800

07 Jan, 2006

9 commits

4dbcdc751 [PATCH] md: count corrected read errors per drive

Store this total in the superblock (as appropriate), and make it available to
userspace via sysfs.

Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:09 +0800
d9d166c2a [PATCH] md: allow array level to be set textually via sysfs

Signed-off-by: Neil Brown
Acked-by: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:09 +0800
f188593ee [PATCH] md: fix typo in comment

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:07 +0800
1345b1d8a [PATCH] md: define and use safe_put_page for md

md sometimes calls put_page on NULL pointers (treating it like kfree). This is
not safe, so define and use a 'safe_put_page' which checks for NULL.
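
A sketch of such a wrapper. struct page and put_page() are mocked here with a plain reference counter so that the example stands alone; only the NULL check is the point.

```c
#include <stddef.h>

/* Mock page with a reference count; the real struct page is different. */
struct page { int refcount; };

/* Mock put_page(): like the real one, it must not be handed NULL. */
static void put_page(struct page *p)
{
    p->refcount--;
}

/* NULL-tolerant wrapper, in the spirit of kfree(NULL) being a no-op. */
static void safe_put_page(struct page *p)
{
    if (p)
        put_page(p);
}
```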

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:07 +0800
097426f68 [PATCH] md: fix possible problem in raid1/raid10 error overwriting

The code to overwrite/reread for addressing read errors in raid1/raid10
currently assumes that the read will not alter the buffer which could be used
to write to the next device. This is not a safe assumption to make.

So we split the loops into an overwrite loop and a separate re-read loop, so
that the writing is complete before reading is attempted.
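
The two-pass structure can be sketched with in-memory mock devices. The struct, the block size and fix_read_error()'s signature here are hypothetical; memcpy stands in for device I/O.

```c
#include <stdbool.h>
#include <string.h>

#define BLK 8                       /* mock block size */

struct dev { bool working; char data[BLK]; };

/* Copy the block from device 'good' over every other working device,
 * then re-read each of them. Returns the number of devices re-read. */
static int fix_read_error(struct dev *devs, int n, int good)
{
    char buf[BLK];
    int verified = 0;

    memcpy(buf, devs[good].data, BLK);

    /* Pass 1: all overwrites, while buf still holds the good data. */
    for (int i = 0; i < n; i++)
        if (i != good && devs[i].working)
            memcpy(devs[i].data, buf, BLK);

    /* Pass 2: re-reads may clobber buf, which is harmless by now. */
    for (int i = 0; i < n; i++)
        if (i != good && devs[i].working) {
            memcpy(buf, devs[i].data, BLK);
            verified++;
        }
    return verified;
}
```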

Cc: Paul Clements
Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:06 +0800
2604b703b [PATCH] md: remove personality numbering from md

md supports multiple different RAID levels, each being implemented by a
'personality' (which is often in a separate module).

These personalities have fairly artificial 'numbers'. The numbers
are used to:
 1- provide an index into an array where the various personalities
    are recorded
 2- identify the module (via an alias) which implements a particular
    personality.

Neither of these uses really justifies the existence of personality numbers.
The array can be replaced by a linked list which is searched (array lookup
only happens very rarely). Module identification can be done using an alias
based on level rather than 'personality' number.

The current 'raid5' module supports two levels (4 and 5) but only one
personality. This slight awkwardness (which was handled in the mapping from
level to personality) can be better handled by allowing raid5 to register 2
personalities.
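
A sketch of a registry that is a searched linked list keyed by level. The struct and function names are hypothetical, and a singly linked list without locking is used for brevity.

```c
#include <stddef.h>

struct personality {
    int level;                      /* RAID level this entry implements */
    const char *name;
    struct personality *next;
};

static struct personality *pers_list;   /* head of the registry */

static void register_personality(struct personality *p)
{
    p->next = pers_list;
    pers_list = p;
}

/* Linear search; lookups are rare, so no index array is needed. */
static struct personality *find_personality(int level)
{
    for (struct personality *p = pers_list; p; p = p->next)
        if (p->level == level)
            return p;
    return NULL;
}
```

With this shape one module can register several entries, for example one for level 4 and one for level 5.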

With this change in place, the core md module does not need to have an
exhaustive list of all possible personalities, so other personalities can be
added independently.

This patch also moves the check for chunksize being non-zero into the ->run
routines for the personalities that need it, rather than having it in core-md.
This has a side effect of allowing 'faulty' and 'linear' not to have a
chunk-size set.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:06 +0800
a24a8dd85 [PATCH] md: break out of a loop that doesn't need to run to completion

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:06 +0800
9ffae0cf3 [PATCH] md: convert md to use kzalloc throughout

Replace multiple kmalloc/memset pairs with kzalloc calls.
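
The same transformation in userspace terms, with malloc/memset and calloc standing in for kmalloc/memset and kzalloc. struct conf and both helpers are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

struct conf { int raid_disks; void *mirrors; };

/* Before: allocate, then clear by hand. */
static struct conf *alloc_conf_old(void)
{
    struct conf *c = malloc(sizeof(*c));

    if (c)
        memset(c, 0, sizeof(*c));
    return c;
}

/* After: one call that returns zeroed memory. */
static struct conf *alloc_conf_new(void)
{
    return calloc(1, sizeof(struct conf));
}
```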

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:05 +0800
2d1f3b5d1 [PATCH] md: clean up 'page' related names in md

Substitute:

page_cache_get -> get_page
page_cache_release -> put_page
PAGE_CACHE_SHIFT -> PAGE_SHIFT
PAGE_CACHE_SIZE -> PAGE_SIZE
PAGE_CACHE_MASK -> PAGE_MASK
__free_page -> put_page

because we aren't using the page cache, we are just using pages.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-01-07 00:34:05 +0800