Doug / smarc-fsl-linux-kernel | Embedian Git Server

13 Dec, 2012

4 commits

29a8d9a0b Btrfs: introduce GET_READ_MIRRORS functionality for btrfs_map_block()

Before this commit, btrfs_map_block() was called with REQ_WRITE
in order to retrieve the list of mirrors for a disk block.
This needs to be changed for the device replace procedure since
it makes a difference whether you are asking for read mirrors
or for locations to write to.
GET_READ_MIRRORS is introduced as a new interface to call
btrfs_map_block().
In the current commit, the functionality is not yet changed,
only the interface for GET_READ_MIRRORS is introduced and all
the places that should use this new interface are adapted.

The reason that REQ_WRITE cannot be abused anymore to retrieve
a list of read mirrors is that during a running dev replace
operation all write requests to the live filesystem are
duplicated to also write to the target drive.
Keep in mind that the target disk is only partially a valid
copy of the source disk while the operation is ongoing. All
writes go to the target disk, but not all reads would return
valid data on the target disk. Therefore it is not possible
anymore to abuse a REQ_WRITE interface to find valid mirrors
for a REQ_READ.
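
The distinction described above can be pictured with a small user-space sketch. It is not the kernel's btrfs_map_block(); the enum, the struct, and the function map_block_sketch() are hypothetical names made up for illustration, and the two-copy layout on devices 1 and 2 is an assumption. It shows the behavior the message anticipates for a running replace (this commit itself only introduces the interface): a write mapping gains the target device, while a request for read mirrors does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request kinds; only the idea comes from the commit message. */
enum map_request { MAP_WRITE, MAP_GET_READ_MIRRORS };

#define MAX_STRIPES 4

struct stripe_list {
    int devid[MAX_STRIPES];
    size_t count;
};

/*
 * Toy mapping for a two-copy chunk that lives on devices 1 and 2.
 * replace_src == 0 means no replace is running. While a replace of
 * replace_src onto replace_tgt is running, writes are duplicated to the
 * target, but the target is never offered as a read mirror because it
 * is only partially a valid copy.
 */
static struct stripe_list map_block_sketch(enum map_request rq,
                                           int replace_src, int replace_tgt)
{
    struct stripe_list out = { .devid = { 1, 2 }, .count = 2 };

    assert(replace_src >= 0 && replace_src <= 2);
    if (rq == MAP_WRITE && replace_src != 0)
        out.devid[out.count++] = replace_tgt;
    return out;
}
```

With device 1 being replaced by a hypothetical device 3, map_block_sketch(MAP_WRITE, 1, 3) yields three locations and map_block_sketch(MAP_GET_READ_MIRRORS, 1, 3) yields only the two original mirrors.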

Signed-off-by: Stefan Behrens
Signed-off-by: Chris Mason

Stefan Behrens
2012-12-13 06:15:43 +0800
8dabb7420 Btrfs: change core code of btrfs to support the device replace operations

This commit contains all the essential changes to the core code
of Btrfs for support of the device replace procedure.

Signed-off-by: Stefan Behrens
Signed-off-by: Chris Mason

Stefan Behrens
2012-12-13 06:15:42 +0800
ff023aac3 Btrfs: add code to scrub to copy read data to another disk

The device replace procedure makes use of the scrub code. The scrub
code is the most efficient code to read the allocated data of a disk,
i.e. it reads sequentially in order to avoid disk head movements, it
skips unallocated blocks, it uses read ahead mechanisms, and it
contains all the code to detect and repair defects.
This commit adds code to scrub to allow the scrub code to copy read
data to another disk.
One goal is to be able to perform as fast as possible. Therefore the
write requests are collected until huge bios are built, and the
write process is decoupled from the read process with some kind of
flow control, of course, in order to limit the allocated memory.
The best performance on spinning disks could be reached when the
head movements are avoided as much as possible. Therefore a single
worker is used to interface the read process with the write process.
The regular scrub operation works as fast as before, it is not
negatively influenced and actually it is more or less unchanged.
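
The batching and flow control described above might look roughly like the following single-threaded sketch. It is not the scrub code: the struct, the function names, and the two limits (PAGES_PER_BATCH, MAX_INFLIGHT) are assumptions chosen to keep the example small. Pages from the read side are collected into a batch, a full batch is submitted as one unit, and the reader is refused once too many batches are outstanding, which bounds the memory in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_BATCH 4   /* hypothetical; stands in for one large bio */
#define MAX_INFLIGHT    2   /* hypothetical cap on submitted batches */

struct write_ctx {
    size_t pages_in_batch;  /* pages collected in the batch being built */
    size_t inflight;        /* batches submitted but not yet completed */
    size_t submitted;       /* total batches submitted so far */
};

/* Submit the current batch if it holds any pages. */
static void flush_batch(struct write_ctx *w)
{
    if (w->pages_in_batch == 0)
        return;
    w->pages_in_batch = 0;
    w->inflight++;
    w->submitted++;
}

/*
 * Queue one page produced by the read side. Returns false when the
 * current batch is full and the in-flight limit is reached, meaning
 * the reader has to wait for a completion before queueing more.
 */
static bool queue_page(struct write_ctx *w)
{
    if (w->pages_in_batch == PAGES_PER_BATCH) {
        if (w->inflight == MAX_INFLIGHT)
            return false;
        flush_batch(w);
    }
    w->pages_in_batch++;
    return true;
}

/* Called when the device has finished writing one batch. */
static void batch_done(struct write_ctx *w)
{
    assert(w->inflight > 0);
    w->inflight--;
}
```

Starting from a zeroed struct write_ctx, twelve calls to queue_page() succeed, the thirteenth is refused, and it succeeds again after one batch_done().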

Signed-off-by: Stefan Behrens
Signed-off-by: Chris Mason

Stefan Behrens
2012-12-13 06:15:41 +0800
3ec706c83 Btrfs: pass fs_info to btrfs_map_block() instead of mapping_tree

This is required for the device replace procedure in a later step.
Two calling functions also had to be changed to have the fs_info
pointer: repair_io_failure() and scrub_setup_recheck_block().

Signed-off-by: Stefan Behrens
Signed-off-by: Chris Mason

Stefan Behrens
2012-12-13 06:15:34 +0800

03 Oct, 2012

1 commit

99621b44a btrfs: reada_extent doesn't need kref for refcount

All increments and decrements are under the same spinlock - have to be,
since they need to protect the radix_tree it's found in. Just use
int, no need to wank with kref...
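
As an illustration of the argument, here is a user-space sketch with a pthread mutex standing in for the kernel spinlock. The struct and function names are hypothetical, not the reada code. Because every increment and decrement happens with the lock held, a plain int is sufficient.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct extent_sketch {
    int refcnt;             /* plain int: only touched with tree_lock held */
};

/* Stand-in for the spinlock that also protects the tree. */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

static void extent_get(struct extent_sketch *e)
{
    pthread_mutex_lock(&tree_lock);
    e->refcnt++;
    pthread_mutex_unlock(&tree_lock);
}

/* Drops one reference; returns true when it was the last one. */
static bool extent_put(struct extent_sketch *e)
{
    bool last;

    pthread_mutex_lock(&tree_lock);
    assert(e->refcnt > 0);
    last = (--e->refcnt == 0);
    /* A real version would also remove e from the tree here. */
    pthread_mutex_unlock(&tree_lock);
    return last;
}
```

An object that starts with refcnt 1 and receives one extent_get() needs two extent_put() calls before the second one reports the last reference.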

Signed-off-by: Al Viro

Al Viro
2012-10-03 09:35:55 +0800

30 May, 2012

1 commit

3d136a113 Btrfs: set ioprio of scrub readahead to idle

Reduce ioprio class of scrub readahead threads to idle priority.
This setting is fixed. This priority has shown the best performance
during all measurements.
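
For comparison, a user-space thread on Linux can place itself in the idle I/O class through the ioprio_set system call. This is only an analogue and assumes Linux; how the commit sets the priority inside the kernel is not shown in the message. The SK_ constants are local copies of the values documented for the ioprio interface, and both function names are hypothetical.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local copies of the documented Linux ioprio values (assumption: Linux). */
#define SK_IOPRIO_CLASS_SHIFT 13
#define SK_IOPRIO_CLASS_IDLE  3
#define SK_IOPRIO_WHO_PROCESS 1

/* Combine an I/O class and its per-class data into one ioprio value. */
static int ioprio_value(int ioclass, int data)
{
    assert(data >= 0 && data < (1 << SK_IOPRIO_CLASS_SHIFT));
    return (ioclass << SK_IOPRIO_CLASS_SHIFT) | data;
}

/*
 * Put the calling thread into the idle I/O class.
 * Returns 0 on success and -1 with errno set on failure.
 * glibc provides no wrapper, so the raw syscall is used.
 */
int set_self_io_idle(void)
{
    return (int)syscall(SYS_ioprio_set, SK_IOPRIO_WHO_PROCESS, 0,
                        ioprio_value(SK_IOPRIO_CLASS_IDLE, 0));
}
```

A worker thread would call set_self_io_idle() once at startup and check the return value.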

Signed-off-by: Stefan Behrens

Stefan Behrens
2012-05-30 22:23:43 +0800

19 Apr, 2012

2 commits

207a232cc btrfs: don't add both copies of DUP to reada extent tree

Normally when there are 2 copies of a block, we add both to the
reada extent tree and prefetch only the one that is easier to reach.
This way we can better utilize multiple devices.
In case of DUP this makes no sense as both copies reside on the
same device.

Signed-off-by: Arne Jansen

Arne Jansen
2012-04-19 01:12:44 +0800
8c9c2bf7a btrfs: fix race in reada

When inserting into the radix tree returns EEXIST, get the existing
entry without giving up the spinlock in between.
There was a race for both the zones trees and the extent tree.
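
The pattern behind this fix can be sketched in user space as follows. A fixed array stands in for the radix tree and a pthread mutex for the spinlock; all names are hypothetical. The point is that the lookup after -EEXIST happens inside the same lock hold as the failed insert, so the existing entry cannot disappear in between.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define SLOTS 16

struct zone_sketch { int id; };

static struct zone_sketch *slots[SLOTS];   /* stand-in for the radix tree */
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert without locking; the caller must hold slots_lock. */
static int slot_insert(size_t idx, struct zone_sketch *z)
{
    if (slots[idx])
        return -EEXIST;
    slots[idx] = z;
    return 0;
}

/*
 * Insert z at idx, or return the entry that is already there.
 * The lock is not dropped between the failed insert and the lookup.
 */
static struct zone_sketch *insert_or_get(size_t idx, struct zone_sketch *z)
{
    struct zone_sketch *ret = z;

    assert(idx < SLOTS);
    pthread_mutex_lock(&slots_lock);
    if (slot_insert(idx, z) == -EEXIST)
        ret = slots[idx];
    pthread_mutex_unlock(&slots_lock);
    return ret;
}
```

The racy variant would unlock after the failed insert and take the lock again for the lookup, leaving a window in which another thread could remove the entry.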

Signed-off-by: Arne Jansen

Arne Jansen
2012-04-19 01:12:44 +0800

28 Mar, 2012

1 commit

94598ba8d Btrfs: introduce common define for max number of mirrors

Readahead already has a define for the max number of mirrors. Scrub
needs such a define now, the rest of the code will need something
like this soon. Therefore the define was added to ctree.h and removed
from the readahead code.

Signed-off-by: Stefan Behrens
Signed-off-by: Chris Mason

Stefan Behrens
2012-03-28 02:21:26 +0800

03 Mar, 2012

1 commit

a175423c8 Btrfs: fix casting error in scrub reada code

The reada code from scrub was casting down a u64 to
an unsigned long so it could insert it into a radix tree.

What it really wanted to do was cast down the result of a shift, instead
of casting down the u64. The bug resulted in trying to insert our
reada struct into the wrong place, which caused soft lockups and other
problems.
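
The difference between the two cast orders only shows up where unsigned long is narrower than u64. The sketch below simulates that with a 32-bit type; the shift of 12 (4 KiB pages) and all names are assumptions, since the message gives neither the shift amount nor the code.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12            /* assumption: 4 KiB pages */

typedef uint32_t ulong32;           /* stand-in for a 32-bit unsigned long */

/* Buggy order: truncate the 64-bit address first, then shift. */
static ulong32 index_cast_then_shift(uint64_t logical)
{
    return (ulong32)logical >> SK_PAGE_SHIFT;
}

/* Intended order: shift the 64-bit address, then truncate the result. */
static ulong32 index_shift_then_cast(uint64_t logical)
{
    return (ulong32)(logical >> SK_PAGE_SHIFT);
}
```

For the address 0x100001000 (4 GiB plus 4 KiB), the buggy order yields index 1, while the intended order yields 0x100001. Below 4 GiB both orders agree, which is why such a bug can stay hidden on small filesystems.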

Signed-off-by: Chris Mason

Chris Mason
2012-03-03 20:42:35 +0800

06 Nov, 2011

3 commits

21ca543ef Btrfs: rename btrfs_bio multi -> bbio for consistency

Signed-off-by: Chris Mason

Ilya Dryomov
2011-11-06 16:11:21 +0800
9510dc4c6 Btrfs: stop leaking btrfs_bios on readahead

Signed-off-by: Chris Mason

Ilya Dryomov
2011-11-06 16:11:08 +0800
806468f8b Merge git://git.jan-o-sch.net/btrfs-unstable into integration

Conflicts:
fs/btrfs/Makefile
fs/btrfs/extent_io.c
fs/btrfs/extent_io.h
fs/btrfs/scrub.c

Signed-off-by: Chris Mason

Chris Mason
2011-11-06 16:07:10 +0800

02 Oct, 2011

1 commit

7414a03fb btrfs: initial readahead code and prototypes

This is the implementation for the generic read ahead framework.

To trigger a readahead, btrfs_reada_add must be called. It will start
a read ahead for the given range [start, end) on tree root. The returned
handle can either be used to wait on the readahead to finish
(btrfs_reada_wait), or to send it to the background (btrfs_reada_detach).

The read ahead works as follows:
On btrfs_reada_add, the root of the tree is inserted into a radix_tree.
reada_start_machine will then search for extents to prefetch and trigger
some reads. When a read finishes for a node, all contained node/leaf
pointers that lie in the given range will also be enqueued. The reads will
be triggered in sequential order, thus giving a big win over a naive
enumeration. It will also make use of multi-device layouts. Each disk
will have its own read pointer and all disks will be utilized in parallel.
Also, no two disks will read both sides of a mirror simultaneously, as this
would waste seeking capacity. Instead both disks will read different parts
of the filesystem.
Any number of readaheads can be started in parallel. The read order will be
determined globally, i.e. 2 parallel readaheads will normally finish faster
than the 2 started one after another.
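
The benefit of a globally determined read order can be sketched as follows. This is an illustration, not the framework itself: the commit message describes a radix_tree and per-disk read pointers, while the sketch simply merges the pending block addresses of two requests into one array and sorts it. The function name global_read_order() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for 64-bit block addresses, ascending. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/*
 * Merge the pending block addresses of two readahead requests into out
 * (which must have room for na + nb entries) and sort them, so that the
 * device is read in ascending order no matter which request asked for
 * a block. Returns the number of entries written.
 */
static size_t global_read_order(const uint64_t *a, size_t na,
                                const uint64_t *b, size_t nb,
                                uint64_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < na; i++)
        out[n++] = a[i];
    for (size_t i = 0; i < nb; i++)
        out[n++] = b[i];
    qsort(out, n, sizeof(out[0]), cmp_u64);
    return n;
}
```

Two requests with pending blocks {900, 100, 500} and {700, 300} are served in the order 100, 300, 500, 700, 900, a single sweep across the disk instead of two interleaved ones.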

Changes v2:
- protect root->node by transaction instead of node_lock
- fix missed branches:
The readahead had a too simple check to determine if a branch from
a node should be checked or not. It now also records the upper bound
of each node to see if the requested RA range lies within.
- use KERN_CONT to debug output, to avoid line breaks
- defer reada_start_machine to worker to avoid deadlock

Changes v3:
- protect root->node by rcu

Changes v5:
- changed EIO-semantics of reada_tree_block_flagged
- remove spin_lock from reada_control and make elems an atomic_t
- remove unused read_total from reada_control
- kill reada_key_cmp, use btrfs_comp_cpu_keys instead
- use kref-style release functions where possible
- return struct reada_control * instead of void * from btrfs_reada_add

Signed-off-by: Arne Jansen

Arne Jansen
2011-10-02 14:48:44 +0800