Eric Lee / smarc-fsl-linux-kernel

11 Jan, 2016

1 commit

b28cf5724 Merge branch 'misc-cleanups-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5

Signed-off-by: Chris Mason <clm@fb.com>

Chris Mason
2016-01-11 22:08:37 +0800

07 Jan, 2016

1 commit

ee22184b5 Btrfs: use linux/sizes.h to represent constants

We use many constants to represent size and offset values. To make the
code readable we use '256 * 1024 * 1024' instead of '268435456' to
represent '256MB'. However, we can make it far more readable with 'SZ_256M',
which is defined in 'linux/sizes.h'.

So this patch replaces expressions of the kind 'xxx * 1024 * 1024' with a
single 'SZ_xxxM' if 'xxx' is a power of 2, and with 'xxx * SZ_1M' if 'xxx' is
not a power of 2. I haven't touched '4096' and '8192' because they are
more intuitive than 'SZ_4K' and 'SZ_8K'.

Signed-off-by: Byongho Lee
Signed-off-by: David Sterba

Byongho Lee
2016-01-07 21:38:02 +0800
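
The replacement rule described in ee22184b5 can be illustrated with a minimal userspace sketch. The SZ_* macros below are local stand-ins that mirror the values in the kernel's linux/sizes.h (that header is not available outside the kernel tree), and the two helper functions are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's <linux/sizes.h> macros (assumption:
 * defined here only so the sketch builds in userspace). */
#define SZ_4K   0x00001000
#define SZ_1M   0x00100000
#define SZ_256M 0x10000000

/* Power-of-two size: one SZ_* constant replaces '256 * 1024 * 1024'. */
static inline uint64_t ex_size_pow2(void)
{
	return SZ_256M;
}

/* Non-power-of-two size: 'xxx * 1024 * 1024' becomes 'xxx * SZ_1M'. */
static inline uint64_t ex_size_non_pow2(void)
{
	return 100ULL * SZ_1M;
}
```

Both forms evaluate to the same byte counts as the expressions they replace; only readability changes.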

24 Dec, 2015

1 commit

afa427cf9 Merge branch 'cleanup/misc-simplify' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5

Chris Mason
2015-12-24 05:10:26 +0800

03 Dec, 2015

1 commit

87ad58c5f btrfs: make btrfs_close_one_device static

Signed-off-by: David Sterba

David Sterba
2015-12-03 22:02:21 +0800

25 Nov, 2015

1 commit

da02c6898 btrfs: fix clashing number of the enhanced balance usage filter

I've accidentally picked an already used number for the enhanced usage
filter represented by BTRFS_BALANCE_ARGS_USAGE_RANGE, clashing with
BTRFS_BALANCE_ARGS_CONVERT. Introduced during the development phase,
no backward compatibility issues.

Reported-by: Holger Hoffstätte
Reported-by: Dan Carpenter
Fixes: bc3094673f22 ("btrfs: extend balance filter usage to take minimum and maximum")
Signed-off-by: David Sterba
Signed-off-by: Chris Mason

David Sterba
2015-11-25 21:19:50 +0800

27 Oct, 2015

5 commits

b66d62ba1 btrfs: add balance filters limits, stripes and usage to supported mask

Enable the extended 'limit' syntax (a range), the new 'stripes' and
extended 'usage' syntax (a range) filters in the filters mask. The patch
comes separate and not within the series that introduced the new filters
because the patch adding the mask was merged in a late rc. The
integration branch was based on an older rc and could not merge the
patch due to the missing changes.

Prerequisites:
* btrfs: check unsupported filters in balance arguments
* btrfs: extend balance filter limit to take minimum and maximum
* btrfs: add balance filter for stripes
* btrfs: extend balance filter usage to take minimum and maximum

Signed-off-by: David Sterba
Signed-off-by: Chris Mason

David Sterba
2015-10-27 10:38:30 +0800
bc3094673 btrfs: extend balance filter usage to take minimum and maximum

Similar to the 'limit' filter, we can enhance the 'usage' filter to
accept a range. The change is backward compatible, the range is applied
only in connection with the BTRFS_BALANCE_ARGS_USAGE_RANGE flag.

We don't have a usecase yet, the current syntax has been sufficient. The
enhancement should provide parity with other range-like filters.

Signed-off-by: David Sterba
Signed-off-by: Chris Mason

David Sterba
2015-10-27 10:38:30 +0800
dee32d0ac btrfs: add balance filter for stripes

Balance block groups which have the given number of stripes, defined by
a range min..max. This is useful to selectively rebalance only chunks
that do not span enough devices, applies to RAID0/10/5/6.

Signed-off-by: Gabríel Arthúr Pétursson
[ renamed bargs members, added to the UAPI, wrote the changelog ]
Signed-off-by: David Sterba

Signed-off-by: Chris Mason

Gabríel Arthúr Pétursson
2015-10-27 10:38:29 +0800
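
The min..max selection described in dee32d0ac can be sketched as a simple inclusive range test. This is a minimal illustration only: the struct and function names are hypothetical, and the kernel's real filter helper may use different names and return conventions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the filter bounds (not the kernel's struct). */
struct ex_stripes_filter {
	uint32_t min;
	uint32_t max;
};

/* Return true when a chunk with num_stripes stripes falls inside the
 * inclusive range min..max and should therefore be balanced. */
static bool ex_stripes_in_range(uint32_t num_stripes,
				const struct ex_stripes_filter *f)
{
	return num_stripes >= f->min && num_stripes <= f->max;
}
```

With bounds 1..3, a chunk striped over 2 devices is selected, while one striped over 4 devices is left alone.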
12907fc79 btrfs: extend balance filter limit to take minimum and maximum

The 'limit' filter is underdesigned, it should have been a range for
[min,max], with some relaxed semantics when one of the bounds is
missing. Besides that, using a full u64 for a single value is a waste of
bytes.

Let's fix both by extending the use of the u64 bytes for the [min,max]
range. This can be done in a backward compatible way, the range will be
interpreted only if the appropriate flag is set
(BTRFS_BALANCE_ARGS_LIMIT_RANGE).

Signed-off-by: David Sterba
Signed-off-by: Chris Mason

David Sterba
2015-10-27 10:38:28 +0800
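
The idea in 12907fc79 of reusing the u64 bytes for a [min,max] pair can be sketched with a union. This is a simplified illustration under stated assumptions: the struct, the flag names and their bit values are hypothetical stand-ins, and the treatment of the legacy single value as an upper bound is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define EX_ARGS_LIMIT       (1ULL << 5)
#define EX_ARGS_LIMIT_RANGE (1ULL << 6)

/* The same 8 bytes hold either one u64 or two u32 bounds. */
struct ex_balance_args {
	uint64_t flags;
	union {
		uint64_t limit;
		struct {
			uint32_t limit_min;
			uint32_t limit_max;
		};
	};
};

/* Interpret the shared bytes according to the flags, so that callers
 * that never set the range flag keep the old single-value behaviour. */
static void ex_limit_bounds(const struct ex_balance_args *a,
			    uint64_t *min, uint64_t *max)
{
	if (a->flags & EX_ARGS_LIMIT_RANGE) {
		*min = a->limit_min;
		*max = a->limit_max;
	} else if (a->flags & EX_ARGS_LIMIT) {
		*min = 0;
		*max = a->limit;	/* assumed: legacy value is an upper bound */
	} else {
		*min = 0;
		*max = UINT64_MAX;	/* no limit filter requested */
	}
}
```

Because the range is read only when the range flag is set, the layout change stays invisible to existing users of the single-value form.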
849ef9286 btrfs: check unsupported filters in balance arguments

We don't verify that all the balance filter arguments supplemented by
the flags are actually known to the kernel. Thus we let it silently pass
and do nothing.

At the moment this means only the 'limit' filter, but we're going to add
a few more soon so it's better to have that fixed. Also in older stable
kernels so that it works with newer userspace tools.

Cc: stable@vger.kernel.org # 3.16+
Signed-off-by: David Sterba
Signed-off-by: Chris Mason

David Sterba
2015-10-27 10:38:26 +0800
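
The check described in 849ef9286 amounts to rejecting any flag bit outside a mask of known filters. A minimal sketch follows; the flag names, bit positions and the choice of -EINVAL as the error code are assumptions for illustration, not the kernel's definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical filter bits, for illustration only. */
#define EX_FILTER_PROFILES (1ULL << 0)
#define EX_FILTER_USAGE    (1ULL << 1)
#define EX_FILTER_LIMIT    (1ULL << 5)

/* Every filter this (imaginary) kernel version understands. */
#define EX_SUPPORTED_MASK \
	(EX_FILTER_PROFILES | EX_FILTER_USAGE | EX_FILTER_LIMIT)

/* Fail loudly on unknown bits instead of silently ignoring them. */
static int ex_check_filters(uint64_t flags)
{
	if (flags & ~EX_SUPPORTED_MASK)
		return -EINVAL;
	return 0;
}
```

A newer userspace tool that sets a filter bit unknown to an older kernel then gets an error back rather than a balance that quietly skips the filter.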

22 Oct, 2015

3 commits

a0d58e48d Merge branch 'cleanups/for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.4

Chris Mason
2015-10-22 09:21:40 +0800
8789f4fe6 btrfs: use btrfs_raid_array for btrfs_get_num_tolerated_disk_barrier_failures()

btrfs_raid_array[] is used to define all raid attributes. Use it
to get tolerated_failures in btrfs_get_num_tolerated_disk_barrier_failures()
instead of a complex condition in the function.

This makes the code simpler and automatically supports other raid types
that may be added in the future.

Reviewed-by: David Sterba
Signed-off-by: Zhao Lei
Signed-off-by: David Sterba

Zhao Lei
2015-10-22 00:28:48 +0800
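
The table-driven approach from 8789f4fe6 and af9020475 can be sketched as follows. The enum, struct and function names are hypothetical stand-ins rather than the kernel's definitions; the failure counts are the generally known redundancy of each RAID level, assuming two-copy RAID1.

```c
/* Hypothetical RAID type index, for illustration only. */
enum ex_raid_type {
	EX_RAID0,
	EX_RAID1,
	EX_RAID10,
	EX_RAID5,
	EX_RAID6,
	EX_SINGLE,
	EX_NR_RAID_TYPES
};

/* One row of per-profile attributes. */
struct ex_raid_attr {
	int tolerated_failures;	/* devices that may fail without data loss */
};

/* Attribute table indexed by RAID type. */
static const struct ex_raid_attr ex_raid_array[EX_NR_RAID_TYPES] = {
	[EX_RAID0]  = { .tolerated_failures = 0 },
	[EX_RAID1]  = { .tolerated_failures = 1 },
	[EX_RAID10] = { .tolerated_failures = 1 },
	[EX_RAID5]  = { .tolerated_failures = 1 },
	[EX_RAID6]  = { .tolerated_failures = 2 },
	[EX_SINGLE] = { .tolerated_failures = 0 },
};

/* A lookup replaces a chain of per-profile conditions. */
static int ex_tolerated_failures(enum ex_raid_type type)
{
	return ex_raid_array[type].tolerated_failures;
}
```

Adding a new profile then means adding one table row; the lookup function does not change.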
af9020475 btrfs: Move btrfs_raid_array to public

This array is used to record the attributes of each raid type.
Make it public, since many functions will benefit from this array.

For example, in num_tolerated_disk_barrier_failures() we can
avoid complex conditions and get the raid attribute
simply by accessing the above array.

It can also simplify the code logic, reduce duplicated code, and
increase maintainability.

Reviewed-by: David Sterba
Signed-off-by: Zhao Lei
Signed-off-by: David Sterba

Zhao Lei
2015-10-22 00:28:48 +0800

02 Oct, 2015

1 commit

f190aa471 Btrfs: add helper for closing one device

Signed-off-by: Anand Jain
[reworded subject and changelog]
Signed-off-by: David Sterba

Anand Jain
2015-10-02 00:00:05 +0800

01 Oct, 2015

1 commit

12b1c2637 Btrfs: enhance btrfs_scratch_superblock to scratch all superblocks

This patch updates and renames btrfs_scratch_superblocks() (which is used
by the replace device thread) with the fixes from the superblock scratch
code in btrfs_rm_device(). The fixes are:
- Scratch all copies of the superblock
- Notify the kobject that the superblock has been changed
- Update the time on the device

This lets btrfs_rm_device() use btrfs_scratch_superblocks() instead of its
own scratch code. The replace device code, which similarly releases a
device back to the system, also gets the fixes from btrfs device delete.

Signed-off-by: Anand Jain
[renamed to btrfs_scratch_superblock]
Signed-off-by: David Sterba

Anand Jain
2015-10-01 23:37:34 +0800

29 Sep, 2015

1 commit

c1b7e4745 Btrfs: rename super_kobj to fsid_kobj

Signed-off-by: Anand Jain
Signed-off-by: David Sterba

Anand Jain
2015-09-29 22:29:59 +0800

09 Aug, 2015

1 commit

46cd28555 Merge branch 'jeffm-discard-4.3' into for-linus-4.3

Chris Mason
2015-08-09 22:35:33 +0800

29 Jul, 2015

1 commit

499f377f4 btrfs: iterate over unused chunk space in FITRIM

Since we now clean up block groups automatically as they become
empty, iterating over block groups is no longer sufficient to discard
unused space.

This patch iterates over the unused chunk space and discards any regions
that are unallocated, regardless of whether they were ever used. This is
a change for btrfs but is consistent with other file systems.

We do this in a transactionless manner since the discard process can take
a substantial amount of time and a transaction would need to be started
before the acquisition of the device list lock. That would mean a
transaction would be held open across /all/ of the discards collectively.
In order to prevent other threads from allocating or freeing chunks, we
hold the chunks lock across the search and discard calls. We release it
between searches to allow the file system to perform more-or-less
normally. Since the running transaction can commit and disappear while
we're using the transaction pointer, we take a reference to it and
release it after the search. This is safe since it would happen normally
at the end of the transaction commit after any locks are released anyway.
We also take the commit_root_sem to protect against a transaction starting
and committing while we're running.

Signed-off-by: Jeff Mahoney
Reviewed-by: Filipe Manana
Tested-by: Filipe Manana
Signed-off-by: Chris Mason

Jeff Mahoney
2015-07-29 23:15:26 +0800

01 Jul, 2015

1 commit

043cd0495 Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs updates from Chris Mason:
"Outside of our usual batch of fixes, this integrates the subvolume
quota updates that Qu Wenruo from Fujitsu has been working on for a
few releases now. He gets an extra gold star for making btrfs smaller
this time, and fixing a number of quota corners in the process.

Dave Sterba tested and integrated Anand Jain's sysfs improvements.
Outside of exporting a symbol (ack'd by Greg) these are all internal
to btrfs and it's mostly cleanups and fixes. Anand also attached some
of our sysfs objects to our internal device management structs instead
of an object off the super block. It will make device management
easier overall and it's a better fit for how the sysfs files are used.
None of the existing sysfs files are moved around.

Thanks for all the fixes everyone"

* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (87 commits)
btrfs: delayed-ref: double free in btrfs_add_delayed_tree_ref()
Btrfs: Check if kobject is initialized before put
lib: export symbol kobject_move()
Btrfs: sysfs: add support to show replacing target in the sysfs
Btrfs: free the stale device
Btrfs: use received_uuid of parent during send
Btrfs: fix use-after-free in btrfs_replay_log
btrfs: wait for delayed iputs on no space
btrfs: qgroup: Make snapshot accounting work with new extent-oriented qgroup.
btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots.
btrfs: ulist: Add ulist_del() function.
btrfs: qgroup: Cleanup the old ref_node-oriented mechanism.
btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism.
btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.
btrfs: qgroup: Switch rescan to new mechanism.
btrfs: qgroup: Add new qgroup calculation function btrfs_qgroup_account_extents().
btrfs: backref: Add special time_seq == (u64)-1 case for btrfs_find_all_roots().
btrfs: qgroup: Add new function to record old_roots.
btrfs: qgroup: Record possible quota-related extent for qgroup.
btrfs: qgroup: Add function qgroup_update_counters().
...

Linus Torvalds
2015-07-01 11:07:45 +0800

27 May, 2015

3 commits

5a13f4308 Btrfs: sysfs: add pointer to access fs_info from fs_devices

Add an fs_info pointer to struct btrfs_fs_devices.

Signed-off-by: Anand Jain
Signed-off-by: David Sterba

Anand Jain
2015-05-27 18:27:21 +0800
c73eccf75 Btrfs: introduce btrfs_get_fs_uuids to get fs_uuids

Signed-off-by: Anand Jain
Signed-off-by: David Sterba

Anand Jain
2015-05-27 18:27:20 +0800
2e7910d6c Btrfs: sysfs: move super_kobj and device_dir_kobj from fs_info to btrfs_fs_devices

This patch provides a framework that helps to create attributes
from the structure btrfs_fs_devices, which is available even before
fs_info is created. Moving the parent kobject super_kobj from
fs_info to btrfs_fs_devices makes it possible to create attributes
from btrfs_fs_devices as well.

Patches on top of this patch will now be able to create the
sys/fs/btrfs/fsid kobject and attributes from btrfs_fs_devices
when devices are scanned and registered with the kernel.

Note that this does not change any of the existing external btrfs sysfs
kobject names or their attributes, nor their life
cycle. The changes are internal only. To ensure this,
the patch has been tested with various device operations,
comparing the sysfs kobjects and attributes with
those seen without this patch; they remain the
same.

Signed-off-by: Anand Jain
Signed-off-by: David Sterba

Anand Jain
2015-05-27 18:27:20 +0800

22 May, 2015

1 commit

326e1dbb5 block: remove management of bi_remaining when restoring original bi_end_io

Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
non-chains") regressed all existing callers that followed this pattern:
1) saving a bio's original bi_end_io
2) wiring up an intermediate bi_end_io
3) restoring the original bi_end_io from intermediate bi_end_io
4) calling bio_endio() to execute the restored original bi_end_io

The regression was due to BIO_CHAIN only ever getting set if
bio_inc_remaining() is called. For the above pattern it isn't set until
step 3 above (step 2 would've needed to establish BIO_CHAIN). As such
the first bio_endio(), in step 2 above, never decremented __bi_remaining
before calling the intermediate bi_end_io -- leaving __bi_remaining with
the value 1 instead of 0. When bio_inc_remaining() occurred during step
3 it brought it to a value of 2. When the second bio_endio() was
called, in step 4 above, it should've called the original bi_end_io but
it didn't because there was an extra reference that wasn't dropped (due
to atomic operations being optimized away since BIO_CHAIN wasn't set
upfront).

Fix this issue by removing the __bi_remaining management complexity for
all callers that use the above pattern -- bio_chain() is the only
interface that _needs_ to be concerned with __bi_remaining. For the
above pattern callers just expect the bi_end_io they set to get called!
Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
that aren't associated with the bio_chain() interface.

Also, the bio_inc_remaining() interface has been moved local to bio.c.

Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
Reviewed-by: Christoph Hellwig
Reviewed-by: Jan Kara
Signed-off-by: Mike Snitzer
Signed-off-by: Jens Axboe

Mike Snitzer
2015-05-22 22:58:55 +0800

17 Feb, 2015

1 commit

9eaed21ef btrfs: remove unused fs_info arg from btrfs_close_extra_devices()

The commit:
8dabb74 Btrfs: change core code of btrfs to support the
device replace operations
added the fs_info argument, but never used it -
just remove it again.

Signed-off-by: Eric Sandeen
Signed-off-by: David Sterba

Eric Sandeen
2015-02-17 01:48:45 +0800

22 Jan, 2015

3 commits

10f119001 Btrfs: Include map_type in raid_bio

Current code uses many kinds of "clever" ways to determine the operation
target's raid type, such as:
raid_map != NULL
or
raid_map[MAX_NR] == RAID[56]_Q_STRIPE

To make the code easier to maintain, this patch puts the raid type into
bbio, so we can always get the raid type from bbio in a "stupid"
way.

Signed-off-by: Zhao Lei
Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Zhao Lei
2015-01-22 10:06:49 +0800
6e9606d2a Btrfs: add ref_count and free function for btrfs_bio

1: ref_count is simpler than the current RBIO_HOLD_BBIO_MAP_BIT flag
for keeping btrfs_bio's memory alive in the raid56 recovery implementation.
2: a free function for bbio makes the code cleaner and more flexible, and
adds data type checking at compile time.

Changelog v1->v2:
Rename following by David Sterba's suggestion:
put_btrfs_bio() -> btrfs_put_bio()
get_btrfs_bio() -> btrfs_get_bio()
bbio->ref_count -> bbio->refs

Signed-off-by: Zhao Lei
Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Zhao Lei
2015-01-22 10:06:48 +0800
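
The get/put pattern that 6e9606d2a introduces can be sketched in userspace as follows. This is a simplified, single-threaded illustration: the struct and function names are hypothetical, and a plain int stands in for the atomic counter that kernel code would need for concurrent access.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object (not the kernel's btrfs_bio). */
struct ex_bio {
	int refs;		/* plain int: this sketch is not thread-safe */
	int num_stripes;
};

/* Allocate an object holding one reference for the caller. */
static struct ex_bio *ex_alloc_bio(int num_stripes)
{
	struct ex_bio *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->refs = 1;
	b->num_stripes = num_stripes;
	return b;
}

/* Take an extra reference, e.g. before handing the object to another user. */
static void ex_get_bio(struct ex_bio *b)
{
	b->refs++;
}

/* Drop one reference; free on the last put. Returns 1 if freed, else 0. */
static int ex_put_bio(struct ex_bio *b)
{
	if (!b)
		return 0;
	if (--b->refs == 0) {
		free(b);
		return 1;
	}
	return 0;
}
```

Each holder pairs one get with one put, so the memory stays valid until the last user is done, without a separate "hold" flag.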
8e5cfb55d Btrfs: Make raid_map array be inlined in btrfs_bio structure

This makes the code simpler and clearer: we no longer need to take care of
freeing bbio and raid_map together.

Signed-off-by: Miao Xie
Signed-off-by: Zhao Lei
Signed-off-by: Chris Mason

Zhao Lei
2015-01-22 10:06:47 +0800

03 Dec, 2014

4 commits

9627aeee3 Merge branch 'raid56-scrub-replace' of git://github.com/miaoxie/linux-btrfs into for-linus

Chris Mason
2014-12-03 10:42:03 +0800
04216820f Btrfs: fix race between fs trimming and block group remove/allocation

Our fs trim operation, which is completely transactionless (it doesn't start
or join an existing transaction), consists of visiting all block groups
and then for each one to iterate its free space entries and perform a
discard operation against the space range represented by the free space
entries. However before performing a discard, the corresponding free space
entry is removed from the free space rbtree, and when the discard completes
it is added back to the free space rbtree.

If a block group remove operation happens while the discard is ongoing (or
before it starts and after a free space entry is hidden), we end up not
waiting for the discard to complete, remove the extent map that maps
logical address to physical addresses and the corresponding chunk metadata
from the chunk and device trees. After that and before the discard
completes, the current running transaction can finish and a new one start,
allowing for new block groups that map to the same physical addresses to
be allocated and written to.

So fix this by keeping the extent map in memory until the discard completes
so that the same physical addresses aren't reused before it completes.

If the physical locations that are under a discard operation end up being
used for a new metadata block group for example, and dirty metadata extents
are written before the discard finishes (the VM might call writepages() of
our btree inode's i_mapping for example, or an fsync log commit happens) we
end up overwriting metadata with zeroes, which leads to errors from fsck
like the following:

checking extents
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
read block failed check_tree_block
owner ref check failed [833912832 16384]
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
Check tree block failed, want=833912832, have=0
read block failed check_tree_block
root 5 root dir 256 error
root 5 inode 260 errors 2001, no inode item, link count wrong
unresolved ref dir 256 index 0 namelen 8 name foobar_3 filetype 1 errors 6, no dir index, no inode ref
root 5 inode 262 errors 2001, no inode item, link count wrong
unresolved ref dir 256 index 0 namelen 8 name foobar_5 filetype 1 errors 6, no dir index, no inode ref
root 5 inode 263 errors 2001, no inode item, link count wrong
(...)

Signed-off-by: Filipe Manana
Signed-off-by: Chris Mason

Filipe Manana
2014-12-03 10:35:09 +0800
2c8cdd6ee Btrfs, replace: write dirty pages into the replace target device

The implementation is simple:
- In order to avoid changing the code logic of btrfs_map_bio and
RAID56, we add the stripes of the replace target devices at the
end of the stripe array in btrfs bio, and we sort those target
device stripes in the array. We also keep the number of target
device stripes in the btrfs bio.
- Except for write operations on RAID56, no other operation
takes the target device stripes into account.
- When we do a write operation, we read the data from the common devices
and calculate the parity. Then we write the dirty data and new parity
out; at this time, we find the relevant replace target stripes
and write the relevant data into them.

Note: The function that copies old data from the source device to
the target device was implemented in the past; it is similar to
the other RAID types.

Signed-off-by: Miao Xie

Miao Xie
2014-12-03 10:18:46 +0800
af8e2d1df Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted

This patch implements the RAID5/6 common data repair function. The
implementation is similar to the scrub on other RAID types such as
RAID1; the difference is that we don't read the data from the
mirror, we use the data repair function of RAID5/6.

Signed-off-by: Miao Xie

Miao Xie
2014-12-03 10:18:45 +0800

25 Nov, 2014

1 commit

084b6e7c7 btrfs: Fix a lockdep warning when running xfstest.

The following lockdep warning is triggered during xfstests:

[ 1702.980872] =========================================================
[ 1702.981181] [ INFO: possible irq lock inversion dependency detected ]
[ 1702.981482] 3.18.0-rc1 #27 Not tainted
[ 1702.981781] ---------------------------------------------------------
[ 1702.982095] kswapd0/77 just changed the state of lock:
[ 1702.982415] (&delayed_node->mutex){+.+.-.}, at: [] __btrfs_release_delayed_node+0x41/0x1f0 [btrfs]
[ 1702.982794] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[ 1702.983160] (&fs_info->dev_replace.lock){+.+.+.}

and interrupts could create inverse lock ordering between them.

[ 1702.984675]
other info that might help us debug this:
[ 1702.985524] Chain exists of:
&delayed_node->mutex --> &found->groups_sem --> &fs_info->dev_replace.lock

[ 1702.986799] Possible interrupt unsafe locking scenario:

[ 1702.987681]        CPU0                    CPU1
[ 1702.988137]        ----                    ----
[ 1702.988598]   lock(&fs_info->dev_replace.lock);
[ 1702.989069]                                local_irq_disable();
[ 1702.989534]                                lock(&delayed_node->mutex);
[ 1702.990038]                                lock(&found->groups_sem);
[ 1702.990494]   <Interrupt>
[ 1702.990938]     lock(&delayed_node->mutex);
[ 1702.991407]
 *** DEADLOCK ***

This is because btrfs_kobj_{add/rm}_device() allocates memory
with GFP_KERNEL,
which may flush the fs page cache to free space and then wait for itself
to do the commit, causing the deadlock.

To solve the problem, move btrfs_kobj_{add/rm}_device() out of the
dev_replace lock range. This also involves splitting the
btrfs_rm_dev_replace_srcdev() function into remove and free parts.

Now only btrfs_rm_dev_replace_remove_srcdev() is called within the dev_replace
lock range, and kobj_{add/rm} and btrfs_rm_dev_replace_free_srcdev() are
called outside the lock range.

Signed-off-by: Qu Wenruo
Signed-off-by: Chris Mason

Qu Wenruo
2014-11-25 21:55:38 +0800

23 Sep, 2014

1 commit

47ab2a6c6 Btrfs: remove empty block groups automatically

One problem that has plagued us is that a user will use up all of his space with
data, remove a bunch of that data, and then try to create a bunch of small files
and run out of space. This happens because all the chunks were allocated for
data since the metadata requirements were so low. But now there's a bunch of
empty data block groups and not enough metadata space to do anything. This
patch solves this problem by automatically deleting empty block groups. If we
notice the used count go down to 0 when deleting or on mount notice that a block
group has a used count of 0 then we will queue it to be deleted.

When the cleaner thread runs we will double check to make sure the block group
is still empty and then we will delete it. This patch has the side effect of no
longer having a bunch of BUG_ON()'s in the chunk delete code, which will be
helpful for both this and relocate. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2014-09-23 08:13:21 +0800

18 Sep, 2014

7 commits

c1dc08967 Btrfs: do file data check by sub-bio's self

Direct IO splits the original bio to several sub-bios because of the limit of
raid stripe, and the filesystem will wait for all sub-bios and then run final
end io process.

But it was very hard to implement the data repair when dio read failure happens,
because at the final end io function, we didn't know which mirror the data was
read from. So in order to implement the data repair, we have to move the file data
check in the final end io function to the sub-bio end io function, in which we can
get the mirror number of the device we access. This patch did this work as the
first step of the direct io data repair implementation.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:53 +0800
67a2c45ee Btrfs: fix use-after-free problem of the device during device replace

The problem is:
Task0(device scan task)         Task1(device replace task)
scan_one_device()
mutex_lock(&uuid_mutex)
device = find_device()
                                mutex_lock(&device_list_mutex)
                                lock_chunk()
                                rm_and_free_source_device
                                unlock_chunk()
                                mutex_unlock(&device_list_mutex)
check device

Destroying the target device if device replace fails also has the same problem.

We fix this problem by holding uuid_mutex while destroying the source device
or the target device, just like the device remove operation does.

This is a temporary solution; in the future we can fix this problem and make
the code clearer by using an atomic counter.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:44 +0800
7cc8e58d5 Btrfs: fix unprotected device's variants on 32bits machine

->total_bytes, ->disk_total_bytes and ->bytes_used are protected by the chunk
lock when we change them, but sometimes we read them without any lock
and might get an unexpected value. We fix this problem the same way as
inode's i_size.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:38 +0800
ce7213c70 Btrfs: fix wrong device bytes_used in the super block

device->bytes_used will be changed when allocating a new chunk, and
disk_total_size will be changed if resizing is successful.
Meanwhile, the on-disk super blocks of the previous transaction
might not be updated. Considering the consistency of the metadata
in the previous transaction, we should use the size in the previous
transaction to check if the super block is beyond the boundary
of the device.

This is not a big problem because we don't use it now, but
it is better to keep it consistent with the common metadata
in case we use it in the future.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:34 +0800
935e5cc93 Btrfs: fix wrong disk size when writing super blocks

total_size will be changed when resizing a device, and disk_total_size
will be changed if resizing is successful. Meanwhile, the on-disk super
blocks of the previous transaction might not be updated. Considering
the consistency of the metadata in the previous transaction, we should
use the size in the previous transaction to check if the super block is
beyond the boundary of the device. Fix it.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:33 +0800
1c43366d3 Btrfs: fix unprotected assignment of the target device

We didn't protect the assignment of the target device, which could cause
the super block update to be skipped because we might find the wrong
size of the target device during the assignment. Fix it by moving the
assignment statements into the initialization function of the target device.
Another benefit is that we can check earlier whether the target device is
suitable.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:31 +0800
90180da42 Btrfs: cleanup unused num_can_discard in fs_devices

The member variable num_can_discard of the fs_devices structure
is set, but nothing uses it, so remove it.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-09-18 04:38:29 +0800