04 Jun, 2009

1 commit


27 Apr, 2009

1 commit

  • Previously, we updated a device's size prior to attempting a shrink
    operation. This patch moves the device resizing logic to only happen if
    the shrink completes successfully. In the process, it introduces a new
    field to btrfs_device -- disk_total_bytes -- to track the on-disk size.
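
    A minimal sketch of the ordering this patch establishes; the helper name
    below is hypothetical, only disk_total_bytes comes from the patch itself:

    /* shrink first; only update the in-memory sizes on success */
    ret = relocate_shrunk_extents(device, new_size);   /* hypothetical helper */
    if (ret)
            return ret;   /* sizes stay untouched if the shrink failed */

    device->total_bytes = new_size;
    device->disk_total_bytes = new_size;   /* new field: size recorded on disk */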

    Signed-off-by: Chris Ball
    Signed-off-by: Chris Mason

    Chris Ball
     

21 Apr, 2009

1 commit

  • Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
    writes we plan on waiting on in the near future. This patch
    mirrors recent changes in other filesystems and the generic code to
    use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
    other latency critical writes.

    Btrfs uses async worker threads for checksumming before the write is done,
    and then again to actually submit the bios. The bio submission code just
    runs a per-device list of bios that need to be sent down the pipe.

    This list is split into low priority and high priority lists so the
    WRITE_SYNC IO happens first.
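
    A minimal sketch of the rule described above, assuming the caller has a
    struct writeback_control; this is illustrative rather than the exact
    btrfs submission code:

    int rw = WRITE;

    /* WB_SYNC_ALL means someone will wait on this IO shortly; tag it
     * sync so the IO scheduler treats it as latency critical */
    if (wbc->sync_mode == WB_SYNC_ALL)
            rw = WRITE_SYNC;

    submit_bio(rw, bio);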

    Signed-off-by: Chris Mason

    Chris Mason
     

03 Apr, 2009

2 commits

  • Btrfs pages being written get set to writeback, and then may go through
    a number of steps before they hit the block layer. This includes compression,
    checksumming and async bio submission.

    The end result is that someone who writes a page and then does
    wait_on_page_writeback is likely to unplug the queue before the bio they
    cared about got there.

    We could fix this by marking bios sync, or by doing more frequent unplugs,
    but this commit just changes the async bio submission code to unplug
    after it has processed all the bios for a device. The async bio submission
    does a fair job of collecting bios, so this shouldn't hurt merging at
    the elevator much.
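
    A rough sketch of the unplug-once-per-pass idea using the block layer
    interfaces of that era; the loop and the next_pending_bio() helper are
    illustrative only:

    struct backing_dev_info *bdi = blk_get_backing_dev_info(device->bdev);
    struct bio *cur;

    while ((cur = next_pending_bio(device)) != NULL)   /* hypothetical helper */
            submit_bio(cur->bi_rw, cur);

    /* every queued bio for this device has been handed to the block
     * layer; unplug once here instead of relying on the waiters */
    blk_run_backing_dev(bdi, NULL);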

    For streaming O_DIRECT writes on a 5 drive array, it boosts performance
    from 386MB/s to 460MB/s.

    Thanks to Hisashi Hifumi for helping with this work.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Btrfs uses async helper threads to submit write bios so the checksumming
    helper threads don't block on the disk.

    The submit bio threads may process bios for more than one block device,
    so when they find one device congested they try to move on to other
    devices instead of blocking in get_request_wait for one device.

    This does a pretty good job of keeping multiple devices busy, but the
    congested flag has a number of problems. A congested device may still
    give you a request, and other procs that aren't backing off the congested
    device may starve you out.

    This commit uses the io_context stored in current to decide if our process
    has been made a batching process by the block layer. If so, it keeps
    sending IO down for at least one batch. This helps make sure we do
    a good amount of work each time we visit a bdev, and avoids large IO
    stalls in multi-device workloads.
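
    A sketch of the batching test described above; nr_batch_requests and
    last_waited are fields of struct io_context in kernels of that era, but
    the surrounding loop structure is illustrative:

    /* inside the loop walking a device's pending bios */
    if (pending && bdi_write_congested(bdi)) {
            struct io_context *ioc = current->io_context;

            if (ioc && ioc->nr_batch_requests > 0 &&
                time_before(jiffies, ioc->last_waited + HZ)) {
                    /* the block layer made us a batching process; keep
                     * submitting to this device for the rest of the
                     * batch instead of bouncing to another device */
                    continue;
            }
            /* otherwise requeue the work and move on to other devices */
    }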

    It's also very ugly. A better solution is in the works with Jens Axboe.

    Signed-off-by: Chris Mason

    Chris Mason
     

11 Mar, 2009

2 commits

  • The full flag on the space info structs tells the allocator not to try
    and allocate more chunks because the devices in the FS are fully allocated.

    When more devices are added, we need to clear the full flag so the allocator
    knows it has more space available.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Storage allocated to different raid levels in btrfs is tracked by
    a btrfs_space_info structure, and all of the current space_infos are
    collected into a list_head.

    Most filesystems have 3 or 4 of these structs total, and the list is
    only changed when new raid levels are added or at unmount time.

    This commit adds rcu locking on the list head, and properly frees
    things at unmount time. It also clears the space_info->full flag
    whenever new space is added to the FS.

    The locking for the space info list goes like this:

    reads: protected by rcu_read_lock()
    writes: protected by the chunk_mutex

    At unmount time we don't need special locking because all the readers
    are gone.
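
    A minimal sketch of a reader under this scheme (the function name is
    illustrative; the list and field names follow the btrfs code of the time):

    static struct btrfs_space_info *find_space_info(struct btrfs_fs_info *info,
                                                    u64 flags)
    {
            struct btrfs_space_info *found;

            rcu_read_lock();
            list_for_each_entry_rcu(found, &info->space_info, list) {
                    if (found->flags == flags) {
                            rcu_read_unlock();
                            return found;
                    }
            }
            rcu_read_unlock();
            return NULL;
    }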

    Signed-off-by: Chris Mason

    Chris Mason
     

13 Feb, 2009

1 commit

  • Btrfs is currently using spin_lock_nested with a nested value based
    on the tree depth of the block. But, this doesn't quite work because
    the max tree depth is bigger than what spin_lock_nested can deal with,
    and because locks are sometimes taken before the level field is filled in.

    The solution here is to use lockdep_set_class_and_name instead, and to
    set the class before unlocking the pages when the block is read from the
    disk and just after init of a freshly allocated tree block.
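
    A sketch of how that looks; lockdep_set_class_and_name is the real API,
    while the class array and helper name here are illustrative:

    static struct lock_class_key btrfs_eb_class[BTRFS_MAX_LEVEL + 1];

    /* called after read_tree_block() and after allocating a fresh block,
     * once the level field is known and before anyone takes eb->lock */
    static void set_eb_lockdep_class(struct extent_buffer *eb, int level)
    {
            lockdep_set_class_and_name(&eb->lock, &btrfs_eb_class[level],
                                       "btrfs-tree");
    }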

    btrfs_clear_path_blocking is also changed to take the locks in the proper
    order, and it also makes sure all the locks currently held are properly
    set to blocking before it tries to retake the spinlocks. Otherwise, lockdep
    gets upset about bad lock ordering.

    The lockdep magic came from Peter Zijlstra.

    Signed-off-by: Chris Mason

    Chris Mason
     

12 Feb, 2009

1 commit

  • The call to kzalloc is followed by a kmalloc whose result is stored in the
    same variable.

    The semantic match that finds the problem is as follows:
    (http://www.emn.fr/x-info/coccinelle/)

    // <smpl>
    @r exists@
    local idexpression x;
    statement S;
    expression E;
    identifier f,l;
    position p1,p2;
    expression *ptr != NULL;
    @@

    (
    if ((x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...)) == NULL) S
    |
    x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
    ...
    if (x == NULL) S
    )
    <... when != x
         when != if (...) { <+...x...+> }
    x->f = E
    ...>
    (
    return \(0\|<+...x...+>\|ptr\);
    |
    return@p2 ...;
    )

    @script:python@
    p1 << r.p1;
    p2 << r.p2;
    @@

    print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
    // </smpl>

    Signed-off-by: Julia Lawall
    Signed-off-by: Chris Mason

    Julia Lawall
     

04 Feb, 2009

1 commit


21 Jan, 2009

3 commits


17 Jan, 2009

1 commit

  • Btrfs maintains a queue of async bio submissions so the checksumming
    threads don't have to wait on get_request_wait. In order to avoid
    extra wakeups, this code has a running_pending flag that is used
    to tell new submissions they don't need to wake the thread.

    When the threads notice congestion on a single device, they
    may decide to requeue the job and move on to other devices. This
    makes sure the running_pending flag is cleared before the
    job is requeued.

    It should help avoid IO stalls by making sure the task is woken up
    when new submissions come in.

    Signed-off-by: Chris Mason

    Chris Mason
     

06 Jan, 2009

1 commit


12 Dec, 2008

1 commit

  • This patch makes it possible for a seed device to be shared by
    multiple mounted file systems. The sharing is achieved
    by cloning the seed device's btrfs_fs_devices structure.
    Thank you,

    Signed-off-by: Yan Zheng

    Yan Zheng
     

09 Dec, 2008

4 commits

  • This adds a sequence number to the btrfs inode that is increased on
    every update. NFS will be able to use that to detect when an inode has
    changed, without relying on inaccurate time fields.

    While we're here, this also:

    Puts reserved space into the super block and inode

    Adds a log root transid to the super so we can pick the newest super
    based on the fsync log as well as the main transaction ID. For now
    the log root transid is always zero, but that'll get fixed.

    Adds a starting offset to the dev_item. This will let us do better
    alignment calculations if we know the start of a partition on the disk.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • It is possible that generic_bin_search will be called on a tree block
    that has not been locked. This happens because cache_block_group skips
    locking on the tree blocks.

    Since the tree block isn't locked, we aren't allowed to change
    the extent_buffer->map_token field. Using map_private_extent_buffer
    avoids any changes to the internal extent buffer fields.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • This patch implements superblock duplication. Superblocks
    are stored at offsets 16K, 64M and 256G on every device.
    The space used by superblocks is preserved by the allocator,
    which uses a reverse mapping function to find the logical
    addresses that correspond to superblocks. Thank you,
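
    A sketch of the offset scheme: each additional copy shifts the 16K base
    left by another 12 bits, giving 16K, 64M and 256G (the constant names
    follow ctree.h of that era and are treated as assumptions here):

    #define BTRFS_SUPER_INFO_OFFSET  (16 * 1024)
    #define BTRFS_SUPER_MIRROR_SHIFT 12

    static u64 btrfs_sb_offset(int mirror)
    {
            u64 start = BTRFS_SUPER_INFO_OFFSET;

            if (mirror)
                    return start << (BTRFS_SUPER_MIRROR_SHIFT * mirror);
            return start;   /* 16K, then 64M, then 256G */
    }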

    Signed-off-by: Yan Zheng

    Yan Zheng
     
  • Btrfs stores checksums for each data block. Until now, they have
    been stored in the subvolume trees, indexed by the inode that is
    referencing the data block. This means that when we read the inode,
    we've probably read in at least some checksums as well.

    But, this has a few problems:

    * The checksums are indexed by logical offset in the file. When
    compression is on, this means we have to do the expensive checksumming
    on the uncompressed data. It would be faster if we could checksum
    the compressed data instead.

    * If we implement encryption, we'll be checksumming the plain text and
    storing that on disk. This is significantly less secure.

    * For either compression or encryption, we have to get the plain text
    back before we can verify the checksum as correct. This makes the raid
    layer balancing and extent moving much more expensive.

    * It makes the front end caching code more complex, as we have to touch
    the subvolume and inodes as we cache extents.

    * There is potentially one copy of the checksum in each subvolume
    referencing an extent.

    The solution used here is to store the extent checksums in a dedicated
    tree. This allows us to index the checksums by physical extent
    start and length. It means:

    * The checksum is against the data stored on disk, after any compression
    or encryption is done.

    * The checksum is stored in a central location, and can be verified without
    following back references, or reading inodes.

    This makes compression significantly faster by reducing the amount of
    data that needs to be checksummed. It will also allow much faster
    raid management code in general.

    The checksums are indexed by a key with a fixed objectid (a magic value
    in ctree.h) and offset set to the starting byte of the extent. This
    allows us to copy the checksum items into the fsync log tree directly (or
    any other tree), without having to invent a second format for them.
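
    A sketch of building such a key; the constant names are the ones used in
    ctree.h for this purpose, the surrounding code is illustrative:

    struct btrfs_key key;

    key.objectid = BTRFS_EXTENT_CSUM_OBJECTID;   /* fixed magic objectid */
    key.type = BTRFS_EXTENT_CSUM_KEY;
    key.offset = disk_bytenr;                    /* start byte of the on-disk extent */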

    Signed-off-by: Chris Mason

    Chris Mason
     

02 Dec, 2008

2 commits


20 Nov, 2008

2 commits


18 Nov, 2008

1 commit

  • A seed device is a special btrfs with the SEEDING super flag
    set and can only be mounted in read-only mode. Seed
    devices allow people to create a new btrfs on top of them.

    The new FS contains the same contents as the seed device,
    but it can be mounted in read-write mode.

    This patch does the following:

    1) split the code in btrfs_alloc_chunk into two parts. The first part makes
    the newly allocated chunk usable, but does not do any operation that modifies
    the chunk tree. The second part does the chunk tree modifications. This
    division is for the bootstrap step of adding storage to the seed device.

    2) Update the device management code to handle seed devices.
    The basic idea is: for an FS grown from seed devices, its
    seed devices are put into a list. Seed devices are
    opened on demand at mount time. If any seed device is
    missing or has been changed, the btrfs kernel module will
    refuse to mount the FS.

    3) make btrfs_find_block_group not return NULL when all
    block groups are read-only.
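
    A sketch of the read-only enforcement in 2); the super flag name comes
    from ctree.h, the surrounding error handling is illustrative:

    if ((btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING) &&
        !(sb->s_flags & MS_RDONLY)) {
            printk(KERN_ERR "btrfs: seed devices can only be mounted read-only\n");
            err = -EINVAL;
            goto fail;
    }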

    Signed-off-by: Yan Zheng

    Yan Zheng
     

13 Nov, 2008

1 commit


08 Nov, 2008

1 commit

  • While doing a commit, btrfs makes sure all the metadata blocks
    were properly written to disk, calling wait_on_page_writeback for
    each page. This writeback happens after allowing another transaction
    to start, so it competes for the disk with other processes in the FS.

    If the page writeback bit is still set, each wait_on_page_writeback might
    trigger an unplug, even though the page might be waiting for checksumming
    to finish or might be waiting for the async work queue to submit the
    bio.

    This trades wait_on_page_writeback for waiting on the extent writeback
    bits. It won't trigger any unplugs and substantially improves performance
    in a number of workloads.

    This also changes the async bio submission to avoid requeueing if there
    is only one device. The requeue just wastes CPU time because there are
    no other devices to service.
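
    A sketch of the single-device shortcut; the exact condition in the patch
    may differ, but the idea is that requeueing only pays off when another
    device could make progress in the meantime:

    if (pending && bdi_write_congested(bdi) &&
        fs_info->fs_devices->open_devices > 1) {
            /* another device can be serviced while this one is congested */
            btrfs_requeue_work(&device->work);
            goto done;
    }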

    Signed-off-by: Chris Mason

    Chris Mason
     

30 Oct, 2008

2 commits

  • This patch removes the giant fs_info->alloc_mutex and replaces it with a bunch
    of little locks.

    There is now a pinned_mutex, which is used when messing with the pinned_extents
    extent io tree, and the extent_ins_mutex which is used with the pending_del and
    extent_ins extent io trees.

    The locking for the extent tree stuff was inspired by a patch that Yan Zheng
    wrote to fix a race condition. I cleaned it up some and changed the locking
    around a little bit, but the idea remains the same. Basically instead of
    holding the extent_ins_mutex throughout the processing of an extent on the
    extent_ins or pending_del trees, we just hold it while we're searching and when
    we clear the bits on those trees, and lock the extent for the duration of the
    operations on the extent.

    Also, to keep from getting hung up waiting to lock an extent, I've added a
    try_lock_extent so if we cannot lock an extent, we move on to the next one in the
    tree and come back to it later. I have tested this heavily and it does
    not appear to break anything. This has to be applied on top of my
    find_free_extent redo patch.
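
    A sketch of the skip-locked-extents pattern described above; the loop
    around the new try_lock_extent helper is illustrative:

    if (!try_lock_extent(&info->extent_ins, start, end, GFP_NOFS)) {
            /* someone else holds this extent locked; move on and come
             * back to it on a later pass */
            search_start = end + 1;
            continue;
    }
    /* ... process the pending insert/delete for [start, end] ... */
    unlock_extent(&info->extent_ins, start, end, GFP_NOFS);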

    I tested this patch on top of Yan's space rebalancing code and it worked fine.
    The only thing that has changed since the last version is that I pulled out all my
    debugging stuff; apparently I forgot to run guilt refresh before I sent the
    last patch out. Thank you,

    Signed-off-by: Josef Bacik

    Josef Bacik
     
  • This is a large change for adding compression on reading and writing,
    both for inline and regular extents. It does some fairly large
    surgery to the writeback paths.

    Compression is off by default and enabled by mount -o compress. Even
    when the -o compress mount option is not used, it is possible to read
    compressed extents off the disk.

    If compression for a given set of pages fails to make them smaller, the
    file is flagged to avoid future compression attempts.
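
    A sketch of that flagging; the inode flag matches the one used for this
    purpose, while the comparison around it is illustrative:

    if (total_compressed >= total_in) {
            /* compression didn't shrink this range; remember that so
             * future writes to the inode skip the attempt */
            BTRFS_I(inode)->flags |= BTRFS_INODE_NOCOMPRESS;
    }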

    * While finding delalloc extents, the pages are locked before being sent down
    to the delalloc handler. This allows the delalloc handler to do complex things
    such as cleaning the pages, marking them writeback and starting IO on their
    behalf.

    * Inline extents are inserted at delalloc time now. This allows us to compress
    the data before inserting the inline extent, and it allows us to insert
    an inline extent that spans multiple pages.

    * All of the in-memory extent representations (extent_map.c, ordered-data.c etc)
    are changed to record both an in-memory size and an on disk size, as well
    as a flag for compression.

    From a disk format point of view, the extent pointers in the file are changed
    to record the on disk size of a given extent and some encoding flags.
    Space in the disk format is allocated for compression encoding, as well
    as encryption and a generic 'other' field. Neither the encryption nor the
    'other' field is currently used.

    In order to limit the amount of data read for a single random read in the
    file, the size of a compressed extent is limited to 128k. This is a
    software only limit; the disk format supports u64 sized compressed extents.

    In order to limit the ram consumed while processing extents, the uncompressed
    size of a compressed extent is limited to 256k. This is a software only limit
    and will be subject to tuning later.

    Checksumming is still done on compressed extents, and it is done on the
    uncompressed version of the data. This way additional encodings can be
    layered on without having to figure out which encoding to checksum.

    Compression happens at delalloc time, which is basically single threaded because
    it is usually done by a single pdflush thread. This makes it tricky to
    spread the compression load across all the cpus on the box. We'll have to
    look at parallel pdflush walks of dirty inodes at a later time.

    Decompression is hooked into readpages and it does spread across CPUs nicely.

    Signed-off-by: Chris Mason

    Chris Mason
     

04 Oct, 2008

1 commit


29 Sep, 2008

1 commit

  • btrfs-vol -a /dev/xxx will zero the first and last two MB of the device.
    The kernel code needs to wait for this IO to finish before it adds
    the device.

    btrfs metadata IO does not happen through the block device inode. A
    separate address space is used, allowing the zero filled buffer heads in
    the block device inode to be written to disk after FS metadata starts
    going down to the disk via the btrfs metadata inode.

    The end result is zero filled metadata blocks after adding new devices
    into the filesystem.

    The fix is a simple filemap_write_and_wait on the block device inode
    before actually inserting it into the pool of available devices.
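
    The fix boils down to something like the sketch below, run before the
    device joins the list of available devices (filemap_write_and_wait is the
    real interface; the context is illustrative):

    /* flush the zeroed regions written by btrfs-vol through the block
     * device inode before btrfs starts writing its own metadata */
    filemap_write_and_wait(bdev->bd_inode->i_mapping);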

    Signed-off-by: Chris Mason

    Chris Mason
     

26 Sep, 2008

2 commits

  • This patch updates the space balancing code to utilize the new
    backref format. Before, btrfs-vol -b would break any COW links
    on data blocks or metadata. This was slow and caused the amount
    of space used to explode if a large number of snapshots were present.

    The new code keeps the sharing of all data extents and
    most of the tree blocks.

    To maintain the sharing of data extents, the space balance code uses
    a separate inode to hold data extent pointers, then updates the references
    to point to the new location.

    To maintain the sharing of tree blocks, the space balance code uses
    reloc trees to relocate tree blocks in reference counted roots.
    There is one reloc tree for each subvol, and all reloc trees share the
    same root key objectid. Reloc trees are snapshots of the latest
    committed roots of subvols (root->commit_root).

    To relocate a tree block referenced by a subvol, there are two steps:
    COW the block through the subvol's reloc tree, then update the block pointer in
    the subvol to point to the new block. Since all reloc trees share the
    same root key objectid, doing special handling for tree blocks
    owned by them is easy. Once a tree block has been COWed in one
    reloc tree, we can use the resulting new block directly when the
    same block is required to COW again through other reloc trees.
    In this way, relocated tree blocks are shared between reloc trees,
    so they are also shared between subvols.

    Signed-off-by: Chris Mason

    Zheng Yan
     
  • Btrfs had compatibility code for kernels back to 2.6.18. That code has
    been removed, and will be maintained in a separate backport
    git tree from now on.

    Signed-off-by: Chris Mason

    Chris Mason
     

25 Sep, 2008

7 commits

  • 1) replace the per fs_info extent_io_tree that tracked free space with two
    rb-trees per block group to track free space areas via offset and size. The
    reason to do this is because most allocations come with a hint byte where to
    start, so we can usually find a chunk of free space at that hint byte to satisfy
    the allocation and get good space packing. If we cannot find free space at or
    after the given offset we fall back on looking for a chunk of the given size as
    close to that given offset as possible. When we fall back on the size search we
    also try to find a slot as close to the size we want as possible, to avoid
    breaking small chunks off of huge areas if possible (the resulting free
    space entry is sketched after this list).

    2) remove the extent_io_tree that tracked the block group cache from fs_info and
    replace it with an rb-tree that tracks the block group cache via offset. Also
    add a per space_info list that tracks the block group cache for the particular
    space so we can look up related block groups easily.

    3) cleaned up the allocation code to make it a little easier to read and a
    little less complicated. Basically there are 3 steps, first look from our
    provided hint. If we couldn't find space from that given hint, start back at our
    original search start and look for space from there. If that fails try to
    allocate space if we can and start looking again. If not we're screwed and need
    to start over again.

    4) small fixes. There were some issues in volumes.c where we wouldn't allocate
    the rest of the disk. Fixed cow_file_range to actually pass the alloc_hint,
    which has helped a good bit in making the fs_mark test I run have semi-normal
    results as we run out of space. Generally with data allocations we don't track
    where we last allocated from, so every time we did a data allocation we'd search
    through every block group that we have looking for free space. While searching a
    block group with no free space isn't terribly time consuming, it was causing a
    slight degradation as we got more data block groups. The alloc_hint has fixed
    this slight degradation and made things semi-normal.
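
    A sketch of the per-block-group free space entry implied by 1); the field
    names are illustrative, modeled on the btrfs free space cache:

    struct btrfs_free_space {
            struct rb_node offset_index;   /* keyed by where the free area starts */
            struct rb_node bytes_index;    /* keyed by how large the free area is */
            u64 offset;
            u64 bytes;
    };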

    There is still one nagging problem I'm working on where we will get ENOSPC when
    there is definitely plenty of space. This only happens with metadata
    allocations, and only when we are almost full. So you generally hit the 85%
    mark first, but sometimes you'll hit the BUG before you hit the 85% wall. I'm
    still tracking it down, but until then this seems to be pretty stable and provides a
    significant performance gain.

    Signed-off-by: Chris Mason

    Josef Bacik
     
  • ---

    Signed-off-by: Chris Mason

    Zheng Yan
     
  • Signed-off-by: Chris Mason

    Chris Mason
     
  • The current code waits for the count of async bio submits to get below
    a given threshold if it is too high right after adding the latest bio
    to the work queue. This isn't optimal because the caller may have
    sequential adjacent bios pending that they are waiting to send down the pipe.

    This changeset requires the caller to wait on the async bio count,
    and changes the async checksumming submits to wait for async bios any
    time they self throttle.

    The end result is much higher sequential throughput.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Before, the btrfs bdi congestion function was used to test for too many
    async bios. This keeps that check to throttle pdflush, but also
    adds a check while queuing bios.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • Signed-off-by: Chris Mason

    Chris Mason
     
  • The multi-bio code is responsible for duplicating blocks in raid1 and
    single spindle duplication. It has counters to make sure all of
    the locations for a given extent are properly written before io completion
    is returned to the higher layers.

    But, it didn't always complete the same bio it was given, sometimes a
    clone was completed instead. This led to problems with the async
    work queues because they saved a pointer to the bio in a struct off
    bi_private.

    The fix is to remember the original bio and only complete that one.

    Signed-off-by: Chris Mason

    Chris Mason