16 Dec, 2011

1 commit


10 Dec, 2011

1 commit

  • btrfs_end_bio checks the number of errors on a bio against the max
    number of errors allowed before sending any EIOs up to the higher
    levels.

    If we got enough copies of the bio done for a given raid level, it is
    supposed to clear the bio error flag and return success.

    We have pointers to the original bio sent down by the higher layers and
    pointers to any cloned bios we made for raid purposes. If the original
    bio happens to be the one that got an io error, but not the last one to
    finish, it might not have the BIO_UPTODATE bit set.

    Then, when the last bio does finish, we'll call bio_end_io on the
    original bio. It won't have the uptodate bit set and we'll end up
    sending EIO to the higher layers.

    We already had a check for this, but it was conditional on getting the
    IO error on the very last bio to finish. Make the check unconditional
    so we eat the EIOs properly.

    Signed-off-by: Chris Mason

    Chris Mason
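    As a rough userspace sketch of the logic described above (the names are
    illustrative, not the kernel's), the unconditional check decides success
    from the total error count and forces the uptodate flag on the original
    bio even when that bio was the one that errored:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the fix: when the last cloned bio
 * completes, decide success from the total error count against the
 * maximum allowed for the raid level -- unconditionally, not only when
 * the final bio to finish was the one that got the IO error. */
struct demo_bio_group {
    int errors;         /* number of cloned bios that failed */
    int max_errors;     /* copies we can lose at this raid level */
    bool orig_uptodate; /* BIO_UPTODATE-like flag on the original bio */
};

/* Returns true if the completion should be reported as success. */
static bool demo_end_bio(struct demo_bio_group *g)
{
    if (g->errors > g->max_errors)
        return false; /* genuine EIO: too many copies failed */
    /* Enough copies succeeded: force the uptodate flag on the
     * original bio even if it was the copy that got the IO error. */
    g->orig_uptodate = true;
    return true;
}
```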
     

08 Dec, 2011

1 commit

  • If we call ioctl(BTRFS_IOC_ADD_DEV) directly, we'll succeed in adding
    a read-only device to a btrfs filesystem, and btrfs will then write to
    that device, emitting kernel errors:

    [ 3109.833692] lost page write due to I/O error on loop2
    [ 3109.833720] lost page write due to I/O error on loop2
    ...

    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
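    A minimal userspace sketch of the check the fix implies (the struct and
    function names are hypothetical, not the kernel's): reject the device up
    front instead of failing writes later.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: refuse to add a block device that is read-only,
 * rather than adding it and later losing page writes with I/O errors. */
struct demo_device {
    bool read_only;
};

static int demo_add_device(const struct demo_device *dev)
{
    if (dev->read_only)
        return -EROFS; /* reject at add time, before any writes happen */
    return 0;
}
```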
     

11 Nov, 2011

1 commit


06 Nov, 2011

3 commits


21 Oct, 2011

1 commit

  • Fix a bug introduced by 20b45077. We have to return EINVAL on mount
    failure, but doing that too early in the sequence leaves all of the
    devices opened exclusively. This also fixes an issue where under some
    scenarios only a second mount -o degraded command would
    succeed.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     

20 Oct, 2011

1 commit

  • One of the things that kills us is the fact that our ENOSPC reservations
    are horribly over the top in most normal cases. There isn't too much that
    can be done about this, because when we are completely full we really
    need them to work like this so we don't under-reserve. However, if there
    are plenty of unallocated chunks on the disk we can use that to gauge how
    much we can overcommit. So this patch adds chunk free space accounting so
    we always know how much unallocated space we have. Then, if we fail to
    make a reservation within our allocated space, we check to see if we can
    overcommit. In the normal flushing case (like with delalloc metadata
    reservations) we'll take the free space, divide it by 2 if our metadata
    profile is set up for DUP or any of those, and then divide it by 8 to
    make sure we don't overcommit too much. Then, if we're in a non-flushing
    case (we really need this reservation now!) we only limit ourselves to
    half of the free space. This makes this fio test

    [torrent]
    filename=torrent-test
    rw=randwrite
    size=4g
    ioengine=sync
    directory=/mnt/btrfs-test

    go from taking around 45 minutes to 10 seconds on my freshly formatted 3 TiB
    file system. This doesn't seem to break my other enospc tests, but could really
    use some more testing as this is a super scary change. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
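    The divide-by-2 / divide-by-8 heuristic above can be sketched in a few
    lines of userspace C (the function and parameter names are illustrative,
    not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the overcommit heuristic: given the unallocated chunk
 * space, decide whether a reservation that missed in allocated space
 * may overcommit. */
static bool demo_can_overcommit(uint64_t unallocated, uint64_t to_reserve,
                                bool profile_dup, bool flushing)
{
    uint64_t avail = unallocated;

    if (flushing) {
        /* normal flushing case, e.g. delalloc metadata reservations */
        if (profile_dup)
            avail /= 2; /* DUP and friends keep two copies */
        avail /= 8;     /* be conservative about overcommitting */
    } else {
        avail /= 2;     /* urgent case: allow up to half the free space */
    }
    return to_reserve <= avail;
}
```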
     

02 Oct, 2011

1 commit


29 Sep, 2011

2 commits

  • The error correction code wants to make sure that only the bad mirror is
    rewritten. Thus, we need to know which mirror is the bad one. I did not
    find a more appropriate field than bi_bdev. But I think using this is
    fine, because it is modified by the block layer anyway, and should not
    be read after the bio has returned.

    Signed-off-by: Jan Schmidt

    Jan Schmidt
     
  • btrfs_bio is a bio abstraction that is able to split and that does not
    complete until the last split bio has returned (like the old
    btrfs_multi_bio). Additionally, btrfs_bio tracks the mirror_num used to
    read data, which can be used for error correction purposes.

    Signed-off-by: Jan Schmidt

    Jan Schmidt
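    An illustrative sketch of that abstraction (the field and function names
    are approximations, not the kernel's struct btrfs_bio): the container
    completes only once every split bio has returned, and it remembers which
    mirror served the read so the error-correction code can target the bad
    copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the described abstraction. */
struct demo_btrfs_bio {
    int stripes_pending; /* outstanding split bios; complete at zero */
    int mirror_num;      /* which mirror this read was served from */
    int errors;          /* accumulated IO errors across the splits */
};

/* Called as each split bio returns; true once the whole container is
 * done and may complete toward the higher layers. */
static bool demo_stripe_done(struct demo_btrfs_bio *b)
{
    return --b->stripes_pending == 0;
}
```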
     

17 Aug, 2011

3 commits

  • sync_pending is uninitialized before it is used; fix it.

    Signed-off-by: Miao Xie
    Signed-off-by: Chris Mason

    Miao Xie
     
  • When balancing, we'll first try to shrink devices for some space,
    but if it is working on a full multi-disk partition with raid protection,
    we may encounter a bug: while shrinking, total_bytes may become less
    than bytes_used, and btrfs may allocate a dev extent that reaches beyond
    the device's bounds.

    Then we will not be able to write or read the data stored at the end
    of the device, and we get errors like the following:

    device fsid 0939f071-7ea3-46c8-95df-f176d773bfb6 devid 1 transid 10 /dev/sdb5
    Btrfs detected SSD devices, enabling SSD mode
    btrfs: relocating block group 476315648 flags 9
    btrfs: found 4 extents
    attempt to access beyond end of device
    sdb5: rw=145, want=546176, limit=546147
    attempt to access beyond end of device
    sdb5: rw=145, want=546304, limit=546147
    attempt to access beyond end of device
    sdb5: rw=145, want=546432, limit=546147
    attempt to access beyond end of device
    sdb5: rw=145, want=546560, limit=546147
    attempt to access beyond end of device

    Signed-off-by: Liu Bo
    Signed-off-by: Chris Mason

    liubo
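    The invariant the fix enforces can be sketched in userspace C (the names
    are illustrative, not the kernel's): a dev extent must end within the
    device, and shrinking must not push total_bytes below bytes_used.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the bounds the fix enforces. */
struct demo_shrink_dev {
    uint64_t total_bytes; /* current device size */
    uint64_t bytes_used;  /* space already consumed by extents */
};

/* A dev extent must lie entirely within the device. */
static int demo_alloc_dev_extent(const struct demo_shrink_dev *d,
                                 uint64_t start, uint64_t len)
{
    if (start + len > d->total_bytes)
        return -ENOSPC; /* would access beyond end of device */
    return 0;
}

/* Shrinking must never leave total_bytes below bytes_used. */
static int demo_shrink_device(struct demo_shrink_dev *d, uint64_t new_size)
{
    if (new_size < d->bytes_used)
        return -ENOSPC;
    d->total_bytes = new_size;
    return 0;
}
```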
     
  • We have a problem where if a user specifies discard but the device
    doesn't actually support it, we will return EOPNOTSUPP from
    btrfs_discard_extent. This is a problem because this gets called (in a
    fashion) from the tree log recovery code, which has a nice little
    BUG_ON(ret) after it, which causes us to fail the tree log replay. So
    instead detect whether our devices support discard when we're adding
    them, and then don't issue discards if we know that the device doesn't
    support it. And just for good measure set ret = 0 in btrfs_issue_discard
    in case we still get EOPNOTSUPP, so we don't screw anybody up like this
    again. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
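    Both halves of the fix can be modeled with a short userspace sketch (the
    names are illustrative, not the kernel's): a per-device flag set at add
    time, a skip on unsupporting devices, and a belt-and-braces squashing of
    EOPNOTSUPP.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two-part fix described above. */
struct demo_discard_dev {
    bool can_discard;    /* detected when the device is added */
    int discards_issued; /* counts real discards for illustration */
};

/* Stand-in for the low-level discard call. */
static int demo_issue_discard(struct demo_discard_dev *dev)
{
    int ret;

    if (dev->can_discard) {
        dev->discards_issued++;
        ret = 0;
    } else {
        ret = -EOPNOTSUPP; /* what a non-supporting device would give */
    }
    if (ret == -EOPNOTSUPP)
        ret = 0; /* never leak EOPNOTSUPP to BUG_ON(ret) callers */
    return ret;
}

static int demo_discard_extent(struct demo_discard_dev *dev)
{
    if (!dev->can_discard)
        return 0; /* skip devices known not to support discard */
    return demo_issue_discard(dev);
}
```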
     

06 Aug, 2011

1 commit

  • Btrfs does bio submissions from a worker thread, and each device
    has a list of high priority bios and regular priority bios.

    Synchronous writes go to the high priority thread, while async writes
    go to the regular list. This commit brings back an explicit unplug
    any time we switch from high to regular priority, which makes it
    easier for the block layer to give us low latencies.

    Signed-off-by: Chris Mason

    Chris Mason
     

02 Aug, 2011

1 commit


28 Jul, 2011

1 commit

  • This patch was originally from Tejun Heo. lockdep complains about the btrfs
    locking because we sometimes take btree locks from two different trees at the
    same time. The current classes are based only on level in the btree, which
    isn't enough information for lockdep to figure out if the lock is safe.

    This patch makes a class for each type of tree, and lumps all the FS trees that
    actually have files and directories into the same class.

    Signed-off-by: Chris Mason

    Chris Mason
     

26 Jul, 2011

1 commit

  • I also removed the BUG_ON from the error return of find_next_chunk() in
    init_first_rw_device(). It turns out that the only caller of
    init_first_rw_device() also BUGs on any nonzero return, so no actual
    behavior change has occurred here.

    do_chunk_alloc() also needed an update since it calls btrfs_alloc_chunk()
    which can now return -ENOMEM. Instead of setting space_info->full on any
    error from btrfs_alloc_chunk() I catch and return every error value _except_
    -ENOSPC. Thanks goes to Tsutomu Itoh for pointing that issue out.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
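    The error filtering in do_chunk_alloc() described above can be sketched
    as follows (the names are illustrative, not the kernel's): only -ENOSPC
    marks the space_info full; everything else, such as -ENOMEM, is returned
    to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the do_chunk_alloc() error handling. */
struct demo_space_info {
    bool full;
};

static int demo_handle_alloc_ret(struct demo_space_info *si, int ret)
{
    if (ret == -ENOSPC) {
        si->full = true; /* out of space: remember it, but don't fail */
        return 0;
    }
    return ret; /* propagate real errors such as -ENOMEM */
}
```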
     

15 Jul, 2011

1 commit

  • Dealing with this seems trivial - the only caller of btrfs_balance() is
    btrfs_ioctl() which passes the error code directly back to userspace. There
    also isn't much state to unwind (if I'm wrong about this point, we can
    always safely move the allocation to the top of btrfs_balance() anyway).

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     

07 Jul, 2011

1 commit


11 Jun, 2011

1 commit


04 Jun, 2011

1 commit


24 May, 2011

8 commits


23 May, 2011

2 commits


13 May, 2011

3 commits

  • In a multi device setup, the chunk allocator currently always allocates
    chunks on the devices in the same order. This leads to a very uneven
    distribution, especially with RAID1 or RAID10 and an uneven number of
    devices.

    This patch always sorts the devices before allocating, and allocates the
    stripes on the devices with the most available space, as long as there
    is enough space available. In a low space situation, it first tries to
    maximize striping.

    The patch also simplifies the allocator and reduces the checks for
    corner cases. The simplification is done by several means. First, it
    defines the properties of each RAID type upfront. These properties are
    used afterwards instead of differentiating cases in several places.

    Second, the old allocator defined a minimum stripe size for each block
    group type, tried to find a large enough chunk, and if this failed just
    allocated a smaller one. This is now done in one step. The largest
    possible chunk (up to max_chunk_size) is searched for and allocated.

    Because we now have only one pass, the allocation of the map (struct
    map_lookup) is moved down to the point where the number of stripes is
    already known. This way we avoid reallocation of the map.

    We still avoid allocating stripes that are not a multiple of STRIPE_SIZE.

    Arne Jansen
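    The sort-by-available-space step can be sketched in userspace C (the
    struct and function names are illustrative, not the kernel's): order the
    devices by free space, descending, then place stripes starting from the
    devices with the most room.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the allocator's device ordering. */
struct demo_alloc_dev {
    uint64_t avail; /* unallocated bytes on this device */
};

static int demo_cmp_avail(const void *a, const void *b)
{
    uint64_t x = ((const struct demo_alloc_dev *)a)->avail;
    uint64_t y = ((const struct demo_alloc_dev *)b)->avail;
    return (x < y) - (x > y); /* descending by available space */
}

static void demo_sort_devices(struct demo_alloc_dev *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), demo_cmp_avail);
}
```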
     
  • Currently alloc_start is disregarded if the requested
    chunk size is bigger than (device size - alloc_start),
    but smaller than the device size.
    The only situation where I see this could have made sense
    was when a chunk equal to the size of the device was
    requested. This was possible because the allocator failed
    to take alloc_start into account when calculating the
    requested chunk size. As this gets fixed by this patch,
    the workaround is not necessary anymore.

    Arne Jansen
     
  • This function won't be used here anymore, so move it to super.c, where
    it is used for the df calculation.

    Arne Jansen
     

12 May, 2011

1 commit

  • This adds an initial implementation for scrub. It works quite
    straightforwardly. Usermode issues an ioctl for each device in the
    fs. For each device, it enumerates the allocated device chunks. For
    each chunk, the contained extents are enumerated and the data checksums
    fetched. The extents are read sequentially and the checksums verified.
    If an error occurs (checksum or EIO), a good copy is searched for. If
    one is found, the bad copy is rewritten.
    All enumerations happen from the commit roots. During a transaction
    commit, the scrubs get paused and afterwards continue from the new
    roots.

    This commit is based on the series originally posted to linux-btrfs
    with some improvements that resulted from comments from David Sterba,
    Ilya Dryomov and Jan Schmidt.

    Signed-off-by: Arne Jansen

    Arne Jansen
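    The verify-and-repair step of that flow can be modeled with a short
    userspace sketch (the struct and function names are hypothetical, and
    every mismatch is assumed repairable from a good mirror):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of scrub's per-extent work: compare the data read
 * from disk against the stored checksum; on mismatch, rewrite the bad
 * copy from a good mirror. */
struct demo_extent {
    uint32_t data_csum;   /* checksum of the data as read */
    uint32_t stored_csum; /* checksum fetched from the csum tree */
};

/* Returns the number of extents repaired. */
static int demo_scrub_extents(struct demo_extent *ext, size_t n)
{
    int repaired = 0;

    for (size_t i = 0; i < n; i++) {
        if (ext[i].data_csum != ext[i].stored_csum) {
            /* bad copy found: rewrite it from a good mirror */
            ext[i].data_csum = ext[i].stored_csum;
            repaired++;
        }
    }
    return repaired;
}
```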
     

06 May, 2011

1 commit

  • Remove static and global declarations and/or definitions. Reduces size
    of btrfs.ko by ~3.4kB.

    text data bss dec hex filename
    402081 7464 200 409745 64091 btrfs.ko.base
    398620 7144 200 405964 631cc btrfs.ko.remove-all

    Signed-off-by: David Sterba

    David Sterba
     

02 May, 2011

2 commits