Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

01 Jul, 2014

1 commit

eed31172a Btrfs: fix scrub_print_warning to handle skinny metadata extents

commit 6eda71d0c030af0fc2f68aaa676e6d445600855b upstream.

The skinny extents are interpreted incorrectly in scrub_print_warning(),
and end up hitting the BUG() in btrfs_extent_inline_ref_size().

Reported-by: Konstantinos Skarlatos
Signed-off-by: Liu Bo
Signed-off-by: Chris Mason
Signed-off-by: Greg Kroah-Hartman

Liu Bo
2014-07-01 11:14:03 +0800

12 Apr, 2014

1 commit

3123bca71 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull second set of btrfs updates from Chris Mason:
"The most important changes here are from Josef, fixing a btrfs
regression in 3.14 that can cause corruptions in the extent allocation
tree when snapshots are in use.

Josef also fixed some deadlocks in send/recv and other assorted races
when balance is running"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (23 commits)
Btrfs: fix compile warnings on on avr32 platform
btrfs: allow mounting btrfs subvolumes with different ro/rw options
btrfs: export global block reserve size as space_info
btrfs: fix crash in remount(thread_pool=) case
Btrfs: abort the transaction when we don't find our extent ref
Btrfs: fix EINVAL checks in btrfs_clone
Btrfs: fix unlock in __start_delalloc_inodes()
Btrfs: scrub raid56 stripes in the right way
Btrfs: don't compress for a small write
Btrfs: more efficient io tree navigation on wait_extent_bit
Btrfs: send, build path string only once in send_hole
btrfs: filter invalid arg for btrfs resize
Btrfs: send, fix data corruption due to incorrect hole detection
Btrfs: kmalloc() doesn't return an ERR_PTR
Btrfs: fix snapshot vs nocow writting
btrfs: Change the expanding write sequence to fix snapshot related bug.
btrfs: make device scan less noisy
btrfs: fix lockdep warning with reclaim lock inversion
Btrfs: hold the commit_root_sem when getting the commit root during send
Btrfs: remove transaction from send
...

Linus Torvalds
2014-04-12 05:16:53 +0800

11 Apr, 2014

1 commit

e4fbaee29 Btrfs: fix compile warnings on on avr32 platform

fs/btrfs/scrub.c: In function 'get_raid56_logic_offset':
fs/btrfs/scrub.c:2269: warning: comparison of distinct pointer types lacks a cast
fs/btrfs/scrub.c:2269: warning: right shift count >= width of type
fs/btrfs/scrub.c:2269: warning: passing argument 1 of '__div64_32' from incompatible pointer type

Since @rot is an int type, we should not use do_div(), fix it.

Reported-by: kbuild test robot
Signed-off-by: Wang Shilong
Signed-off-by: Chris Mason

Wang Shilong
2014-04-11 21:35:50 +0800

08 Apr, 2014

1 commit

3b080b256 Btrfs: scrub raid56 stripes in the right way

Steps to reproduce:
# mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
# mount /dev/sda8 /mnt
# btrfs scrub start -BR /mnt
# echo $?

Signed-off-by: Chris Mason

Wang Shilong
2014-04-08 00:08:49 +0800

05 Apr, 2014

1 commit

53c566625 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs changes from Chris Mason:
"This is a pretty long stream of bug fixes and performance fixes.

Qu Wenruo has replaced the btrfs async threads with regular kernel
workqueues. We'll keep an eye out for performance differences, but
it's nice to be using more generic code for this.

We still have some corruption fixes and other patches coming in for
the merge window, but this batch is tested and ready to go"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (108 commits)
Btrfs: fix a crash of clone with inline extents's split
btrfs: fix uninit variable warning
Btrfs: take into account total references when doing backref lookup
Btrfs: part 2, fix incremental send's decision to delay a dir move/rename
Btrfs: fix incremental send's decision to delay a dir move/rename
Btrfs: remove unnecessary inode generation lookup in send
Btrfs: fix race when updating existing ref head
btrfs: Add trace for btrfs_workqueue alloc/destroy
Btrfs: less fs tree lock contention when using autodefrag
Btrfs: return EPERM when deleting a default subvolume
Btrfs: add missing kfree in btrfs_destroy_workqueue
Btrfs: cache extent states in defrag code path
Btrfs: fix deadlock with nested trans handles
Btrfs: fix possible empty list access when flushing the delalloc inodes
Btrfs: split the global ordered extents mutex
Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock
Btrfs: reclaim delalloc metadata more aggressively
Btrfs: remove unnecessary lock in may_commit_transaction()
Btrfs: remove the unnecessary flush when preparing the pages
Btrfs: just do dirty page flush for the inode with compression before direct IO
...

Linus Torvalds
2014-04-05 06:31:36 +0800

11 Mar, 2014

4 commits

d458b0540 btrfs: Cleanup the "_struct" suffix in btrfs_workequeue

The "_struct" suffix was mainly used to distinguish the newly created
btrfs_work from the original one. Now that all btrfs_workers have been
changed into btrfs_workqueue, the suffix is no longer needed.

This patch also fixes code whose style had been changed to accommodate
the overly long "_struct" suffix.

Signed-off-by: Qu Wenruo
Tested-by: David Sterba
Signed-off-by: Josef Bacik

Qu Wenruo
2014-03-11 03:17:16 +0800

0339ef2f4 btrfs: Replace fs_info->scrub_* workqueue with btrfs_workqueue.

Replace the fs_info->scrub_* with the newly created
btrfs_workqueue.

Signed-off-by: Qu Wenruo
Tested-by: David Sterba
Signed-off-by: Josef Bacik

Qu Wenruo
2014-03-11 03:17:14 +0800

32a447896 Btrfs: wake up @scrub_pause_wait as much as we can

Checking the @scrubs_running == @scrubs_paused condition inside wait_event()
is not an atomic operation, which means @scrubs_running and @scrubs_paused
may be incremented or decremented at any time. Wake up @scrub_pause_wait as
often as we can so that the transaction commit is blocked for less time.

An example below:

Thread1 Thread2
|->scrub_blocked_if_needed() |->scrub_pending_trans_workers_inc
|->increase @scrub_paused
|->increase @scrub_running
|->wake up scrub_pause_wait list
|->scrub blocked
|->increase @scrub_paused

Thread3 is committing a transaction, which is blocked at btrfs_scrub_pause().
After Thread2 increases @scrub_paused, the condition
@scrub_paused=@scrub_running is met, but the transaction will still be blocked
until something else wakes up @scrub_pause_wait.

Signed-off-by: Wang Shilong
Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Wang Shilong
2014-03-11 03:16:54 +0800

12cf93728 Btrfs: device_replace: fix deadlock for nocow case

Commit cb7ab02156e4 causes the following deadlock, found by
xfstests btrfs/011:

Thread1 is committing a transaction, which is blocked at
btrfs_scrub_pause().

Thread2 is calling btrfs_file_aio_write(), which holds the
inode's @i_mutex and is committing a transaction (blocked because
Thread1 is committing the transaction).

Thread3 is the copy_nocow_page worker, which also tries to
take the inode's @i_mutex, so Thread3 waits for Thread2, which in
turn waits for Thread1.

Thread4 is waiting for the pending workers to finish, so it waits
for Thread3. So the problem looks like this:

Thread1--->Thread4--->Thread3--->Thread2---->Thread1

Deadlock happens! We fix it by letting Thread1 go first,
which means we won't block the transaction commit while we are
waiting for the pending workers to finish.
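
The cycle described above can be pictured as a wait-for graph. The following is only an illustrative sketch in Python, not kernel code: the `wait_for` dictionary and the `find_cycle` helper are hypothetical names, and the edge comments paraphrase the commit message.

```python
# Hypothetical wait-for graph for the four threads in the message above.
# An edge A -> B means "A is waiting for B to make progress".
wait_for = {
    "Thread1": "Thread4",  # commit is blocked at btrfs_scrub_pause()
    "Thread4": "Thread3",  # waiting for the pending copy_nocow_page worker
    "Thread3": "Thread2",  # wants the inode's i_mutex, held by Thread2
    "Thread2": "Thread1",  # its commit is blocked behind Thread1's commit
}


def find_cycle(graph, start):
    """Follow wait-for edges from `start`; return the cycle, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = graph.get(node)
    if node is None:
        return None  # the chain ended, so nobody waits forever
    return path[path.index(node):] + [node]


cycle = find_cycle(wait_for, "Thread1")
print(" -> ".join(cycle))

# Model of the fix: Thread1 no longer waits for the pending workers.
fixed = dict(wait_for)
del fixed["Thread1"]
print(find_cycle(fixed, "Thread4"))
```

Removing the single edge out of Thread1 is enough to break the cycle, which matches the "let Thread1 go first" wording of the fix.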

Reported-by: Qu Wenruo
Signed-off-by: Wang Shilong
Signed-off-by: Josef Bacik

Wang Shilong
2014-03-11 03:16:53 +0800

31 Jan, 2014

1 commit

e7651b819 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs updates from Chris Mason:
"This is a pretty big pull, and most of these changes have been
floating in btrfs-next for a long time. Filipe's properties work is a
cool building block for inheriting attributes like compression down on
a per inode basis.

Jeff Mahoney kicked in code to export filesystem info into sysfs.

Otherwise, lots of performance improvements, cleanups and bug fixes.

Looks like there are still a few other small pending incrementals, but
I wanted to get the bulk of this in first"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits)
Btrfs: fix spin_unlock in check_ref_cleanup
Btrfs: setup inode location during btrfs_init_inode_locked
Btrfs: don't use ram_bytes for uncompressed inline items
Btrfs: fix btrfs_search_slot_for_read backwards iteration
Btrfs: do not export ulist functions
Btrfs: rework ulist with list+rb_tree
Btrfs: fix memory leaks on walking backrefs failure
Btrfs: fix send file hole detection leading to data corruption
Btrfs: add a reschedule point in btrfs_find_all_roots()
Btrfs: make send's file extent item search more efficient
Btrfs: fix to catch all errors when resolving indirect ref
Btrfs: fix protection between walking backrefs and root deletion
btrfs: fix warning while merging two adjacent extents
Btrfs: fix infinite path build loops in incremental send
btrfs: undo sysfs when open_ctree() fails
Btrfs: fix snprintf usage by send's gen_unique_name
btrfs: fix defrag 32-bit integer overflow
btrfs: sysfs: list the NO_HOLES feature
btrfs: sysfs: don't show reserved incompat feature
btrfs: call permission checks earlier in ioctls and return EPERM
...

Linus Torvalds
2014-01-31 12:08:20 +0800

29 Jan, 2014

6 commits

ade2e0b3e Btrfs: fix to search previous metadata extent item since skinny metadata

Using btrfs_previous_item() to search for a metadata extent item is buggy.
btrfs_previous_item() requires the key type to match; however, since
skinny metadata was introduced by Josef, the two item types may be mixed, so
btrfs_previous_item() alone does not work correctly.

To keep btrfs_previous_item() behaving like a normal tree search, I introduce
another function, btrfs_previous_extent_item().

Signed-off-by: Wang Shilong
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2014-01-29 05:20:33 +0800

7c76edb77 Btrfs: fix missing skinny metadata check in scrub_stripe()

First check whether skinny metadata is supported, then use the
right key type for the search.

Signed-off-by: Wang Shilong
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2014-01-29 05:20:32 +0800

efe120a06 Btrfs: convert printk to btrfs_ and fix BTRFS prefix

Convert all applicable cases of printk and pr_* to the btrfs_* macros.

Fix all uses of the BTRFS prefix.

Signed-off-by: Frank Holton
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Frank Holton
2014-01-29 05:20:05 +0800

cb7ab0215 Btrfs: wrap repeated code into scrub_blocked_if_needed()

Wrap the repeated code into one function, scrub_blocked_if_needed().

This changes the ordering: we now wait for @workers_pending to reach 0
before we wake up the committing transaction (atomic_inc(@scrub_paused)),
so we must take care not to deadlock here.

Thread 1                                Thread 2
|->btrfs_commit_transaction()
   |->set trans type(COMMIT_DOING)
   |->btrfs_scrub_paused()(blocked)
                                        |->join_transaction(blocked)

Move btrfs_scrub_pause() before setting the trans type, which means we can
still join a transaction while the committing transaction is blocked.

Signed-off-by: Wang Shilong
Suggested-by: Miao Xie
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2014-01-29 05:19:53 +0800

3cb0929ad Btrfs: fix wrong super generation mismatch when scrubbing supers

We hit a race condition when scrubbing superblocks. The story is:

When committing a transaction, we update @last_trans_committed after
writing the superblocks. If a scrubber starts after the superblocks are
written but before @last_trans_committed is updated, a generation mismatch
happens!

We fix this by checking @scrub_pause_req: we won't start a scrubber
until the committing transaction has finished (after btrfs_scrub_continue()
has finished).

Reported-by: Sebastian Ochmann
Signed-off-by: Wang Shilong
Reviewed-by: Miao Xie
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2014-01-29 05:19:52 +0800

ce3e7f107 btrfs: remove unused variable from scrub_fixup_nodatasum

Signed-off-by: Valentina Giusti
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Valentina Giusti
2014-01-29 05:19:34 +0800

06 Dec, 2013

1 commit

5ee540613 Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block layer fixes from Jens Axboe:
"A small collection of fixes for the current series. It contains:

- A fix for a use-after-free of a request in blk-mq. From Ming Lei

- A fix for a blk-mq bug that could attempt to dereference a NULL rq
if allocation failed

- Two xen-blkfront small fixes

- Cleanup of submit_bio_wait() type uses in the kernel, unifying
that. From Kent

- A fix for 32-bit blkg_rwstat reading. I apologize for this one
looking mangled in the shortlog, it's entirely my fault for missing
an empty line between the description and body of the text"

* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: fix use-after-free of request
blk-mq: fix dereference of rq->mq_ctx if allocation fails
block: xen-blkfront: Fix possible NULL ptr dereference
xen-blkfront: Silence pfn maybe-uninitialized warning
block: submit_bio_wait() conversions
Update of blkg_stat and blkg_rwstat may happen in bh context

Linus Torvalds
2013-12-06 07:33:27 +0800

25 Nov, 2013

1 commit

c170bbb45 block: submit_bio_wait() conversions

It was being open coded in a few places.

Signed-off-by: Kent Overstreet
Cc: Jens Axboe
Cc: Joern Engel
Cc: Prasad Joshi
Cc: Neil Brown
Cc: Chris Mason
Acked-by: NeilBrown
Signed-off-by: Jens Axboe

Kent Overstreet
2013-11-25 07:33:41 +0800

24 Nov, 2013

2 commits

4f024f379 block: Abstract out bvec iterator

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet
Cc: Jens Axboe
Cc: Geert Uytterhoeven
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: "Ed L. Cashin"
Cc: Nick Piggin
Cc: Lars Ellenberg
Cc: Jiri Kosina
Cc: Matthew Wilcox
Cc: Geoff Levand
Cc: Yehuda Sadeh
Cc: Sage Weil
Cc: Alex Elder
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris
Cc: Philip Kelleher
Cc: Rusty Russell
Cc: "Michael S. Tsirkin"
Cc: Konrad Rzeszutek Wilk
Cc: Jeremy Fitzhardinge
Cc: Neil Brown
Cc: Alasdair Kergon
Cc: Mike Snitzer
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh
Cc: Benny Halevy
Cc: "James E.J. Bottomley"
Cc: Greg Kroah-Hartman
Cc: "Nicholas A. Bellinger"
Cc: Alexander Viro
Cc: Chris Mason
Cc: "Theodore Ts'o"
Cc: Andreas Dilger
Cc: Jaegeuk Kim
Cc: Steven Whitehouse
Cc: Dave Kleikamp
Cc: Joern Engel
Cc: Prasad Joshi
Cc: Trond Myklebust
Cc: KONISHI Ryusuke
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Ben Myers
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Len Brown
Cc: Pavel Machek
Cc: "Rafael J. Wysocki"
Cc: Herton Ronaldo Krzesinski
Cc: Ben Hutchings
Cc: Andrew Morton
Cc: Guo Chao
Cc: Tejun Heo
Cc: Asai Thambi S P
Cc: Selvan Mani
Cc: Sam Bradshaw
Cc: Wei Yongjun
Cc: "Roger Pau Monné"
Cc: Jan Beulich
Cc: Stefano Stabellini
Cc: Ian Campbell
Cc: Sebastian Ott
Cc: Christian Borntraeger
Cc: Minchan Kim
Cc: Jiang Liu
Cc: Nitin Gupta
Cc: Jerome Marchand
Cc: Joe Perches
Cc: Peng Tao
Cc: Andy Adamson
Cc: fanchaoting
Cc: Jie Liu
Cc: Sunil Mushran
Cc: "Martin K. Petersen"
Cc: Namjae Jeon
Cc: Pankaj Kumar
Cc: Dan Magenheimer
Cc: Mel Gorman

Kent Overstreet
2013-11-24 14:33:47 +0800

33879d451 block: submit_bio_wait() conversions

It was being open coded in a few places.

Signed-off-by: Kent Overstreet
Cc: Jens Axboe
Cc: Joern Engel
Cc: Prasad Joshi
Cc: Neil Brown
Cc: Chris Mason
Acked-by: NeilBrown

Kent Overstreet
2013-11-24 14:33:38 +0800

21 Nov, 2013

1 commit

33ef30add Btrfs: do not inc uncorrectable_errors counter on ro scrubs

Currently if we discover an error when scrubbing in ro mode we a)
blindly increment the uncorrectable_errors counter, and b) spam the
dmesg with the 'unable to fixup (regular) error at ...' message, even
though a) we haven't tried to determine if the error is correctable or
not, and b) we haven't tried to fixup anything. Fix this.

Cc: Stefan Behrens
Signed-off-by: Ilya Dryomov
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Ilya Dryomov
2013-11-21 09:41:38 +0800

12 Nov, 2013

3 commits

3b7a016f4 Btrfs: avoid unnecessary scrub workers allocation

We only allocate scrub workers if we pass all the necessary
checks, for example, that no operation is in progress.

Besides, move the mutex lock protection outside of
scrub_workers_get()/scrub_workers_put().

Signed-off-by: Wang Shilong
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2013-11-12 11:12:58 +0800

9b011adfe Btrfs: remove scrub_super_lock holding in btrfs_sync_log()

Originally, we introduced scrub_super_lock to synchronize the
tree log code with superblock scrubbing.

However, we can replace scrub_super_lock with device_list_mutex,
because writing the superblocks already holds this mutex. This removes
one extra lock acquisition when writing supers in the sync log code.

Signed-off-by: Wang Shilong
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Wang Shilong
2013-11-12 11:10:13 +0800

539f358a3 Btrfs: fix the dev-replace suspend sequence

Replace progresses strictly from lower to higher offsets, and the
progress is tracked in chunks, by storing the physical offset of the
dev_extent which is being copied in the cursor_left field of
btrfs_dev_replace_item. When we are done copying the chunk,
cursor_left is updated to point one byte past the dev_extent, so that
on resume we can skip the dev_extents that have already been copied.

There is a major bug (which goes all the way back to the inception of
dev-replace in 3.8) in the way cursor_left is bumped: the bump is done
unconditionally, without any regard to the scrub_chunk return value.
On suspend (and also on any kind of error) scrub_chunk returns early,
i.e. without completing the copy. This leads to us skipping the chunk
that hasn't been fully copied yet when resuming.

Fix this by doing the cursor_left update only if scrub_chunk ret is 0.
(On suspend scrub_chunk returns with -ECANCELED, so this fix covers
both suspend and error cases.)
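
As a rough illustration of the difference, here is a small Python model of the cursor logic. It is a sketch under stated assumptions, not the kernel implementation: `replace_chunks`, its parameters and the chunk layout are all hypothetical, and only the rule "bump cursor_left only when scrub_chunk returns 0" comes from the message above.

```python
ECANCELED = 125  # stand-in errno value, used only by this model


def replace_chunks(chunks, scrub_chunk, bump_unconditionally):
    """Copy chunks in offset order and return the final cursor_left.

    chunks: list of (physical_offset, length), sorted by offset.
    scrub_chunk: callable returning 0 on success or a negative errno.
    """
    cursor_left = 0
    for offset, length in chunks:
        cursor_left = offset  # dev_extent currently being copied
        ret = scrub_chunk(offset, length)
        if ret == 0 or bump_unconditionally:
            cursor_left = offset + length  # one byte past the dev_extent
        if ret != 0:
            break  # suspend or error: stop copying
    return cursor_left


chunks = [(0, 100), (100, 100), (200, 100)]


def suspend_in_second_chunk(offset, length):
    # Pretend the user suspends the replace while the second chunk is copied.
    return -ECANCELED if offset == 100 else 0


buggy = replace_chunks(chunks, suspend_in_second_chunk, True)
fixed = replace_chunks(chunks, suspend_in_second_chunk, False)
print(buggy, fixed)
```

In this model the unconditional bump leaves the cursor past the half-copied second chunk, so a resume would skip it; the conditional bump leaves the cursor at the start of that chunk.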

Cc: Stefan Behrens
Signed-off-by: Ilya Dryomov
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Ilya Dryomov
2013-11-12 10:55:36 +0800

21 Sep, 2013

1 commit

652f25a29 Btrfs: improve replacing nocow extents

Various people have hit a deadlock when running btrfs/011. This is because when
replacing nocow extents we will take the i_mutex to make sure nobody messes with
the file while we are replacing the extent. The problem is we are already
holding a transaction open, which is a locking inversion, so instead we need to
save these inodes we find and then process them outside of the transaction.

Further we can't just lock the inode and assume we are good to go. We need to
lock the extent range and then read back the extent cache for the inode to make
sure the extent really still points at the physical block we want. If it
doesn't we don't have to copy it. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2013-09-21 23:05:26 +0800

01 Sep, 2013

5 commits

23fa76b0b Btrf: cleanup: don't check for root_refs == 0 twice

btrfs_read_fs_root_no_name() already checks if btrfs_root_refs()
is zero and returns ENOENT in this case. There is no need to do
it again in three more places.

Signed-off-by: Stefan Behrens
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Stefan Behrens
2013-09-01 20:16:29 +0800

118a0a251 Btrfs: Format mirror_num as int

mirror_num is always "int", hence don't cast it to "unsigned long long" and
format it as a 64-bit number.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Geert Uytterhoeven
2013-09-01 20:16:11 +0800

27f9f0235 Btrfs: Format PAGE_SIZE as unsigned long

PAGE_SIZE is "unsigned long" everywhere, so there's no need to cast it to
"unsigned long long" and format it as a 64-bit number.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Geert Uytterhoeven
2013-09-01 20:16:10 +0800

c1c9ff7c9 Btrfs: Remove superfluous casts from u64 to unsigned long long

u64 is "unsigned long long" on all architectures now, so there's no need to
cast it when formatting it using the "ll" length modifier.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Geert Uytterhoeven
2013-09-01 20:16:08 +0800

3cae210fa btrfs: Cleanup for using BTRFS_SETGET_STACK instead of raw convert

Some code still uses the raw cpu_to_lexx conversions instead of the
BTRFS_SETGET_STACK_FUNCS declared in ctree.h.

Also add some BTRFS_SETGET_STACK_FUNCS for btrfs_header, btrfs_timespec
and other structures.

Signed-off-by: Qu Wenruo
Reviewed-by: Miao Xie
Reviewed-by: David Sterba
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Qu Wenruo
2013-09-01 19:57:37 +0800

20 Jul, 2013

1 commit

115930cb2 Btrfs: fix wrong write offset when replacing a device

Miao Xie reported the following issue:

The filesystem was corrupted after we did a device replace.

Steps to reproduce:
# mkfs.btrfs -f -m single -d raid10 ..
# mount
# btrfs replace start -rfB 1
# umount
# btrfsck

The reason for the issue is that we changed the write offset by mistake,
introduced by commit 625f1c8dc.

We read the data from the source device at first, and then write the
data into the corresponding place of the new device. In order to
implement the "-r" option, the source location is remapped using
btrfs_map_block(). The read takes place on the mapped location, and
the write needs to take place on the unmapped location. Currently
the write is using the mapped location, and this commit changes it
back by undoing the change to the write address that the aforementioned
commit added by mistake.
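
The read/write asymmetry can be sketched with a toy model. This is an illustration only: `copy_range`, `map_block` and the byte-array "devices" are hypothetical stand-ins and do not reflect the signature of btrfs_map_block().

```python
def copy_range(devices, target, offset, length, map_block):
    """Copy `length` bytes for a device replace.

    The read location is remapped (for example to another mirror), but the
    write goes to the original, unmapped offset on the target device.
    """
    src_name, src_off = map_block(offset)  # remap the read side only
    data = devices[src_name][src_off:src_off + length]
    target[offset:offset + length] = data  # write at the unmapped offset


# Toy setup: the good copy of the data lives on "mirror" at a shifted offset.
mirror = bytearray(16)
mirror[8:12] = b"DATA"
devices = {"mirror": mirror}
target = bytearray(16)


def map_block(offset):
    # Hypothetical mapping: the same data sits 4 bytes further on the mirror.
    return "mirror", offset + 4


copy_range(devices, target, 4, 4, map_block)
print(bytes(target[4:8]))
```

Writing at the mapped offset instead (8 in this toy setup) would put the data in the wrong place on the new device, which is the kind of corruption the report describes.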

Reported-by: Miao Xie
Cc: # 3.10+
Signed-off-by: Stefan Behrens
Signed-off-by: Josef Bacik

Stefan Behrens
2013-07-20 03:07:26 +0800

02 Jul, 2013

4 commits

edd1400be Btrfs: fix several potential problems in copy_nocow_pages_for_inode

- It makes no sense to deal with an inode in a dead tree.
- fix the race between dio and page copy by waiting for the dio to complete
- avoid the race between page copy and truncate/punch hole
- check whether the page is in the page cache or not

Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Miao Xie
2013-07-02 23:50:58 +0800

826aa0a82 Btrfs: cleanup the code of copy_nocow_pages_for_inode()

- It makes no sense to continue after an error has
happened; this patch returns immediately instead.
- remove some checks in copy_nocow_pages_for_inode(), such as the page check
after the write and the inode check at the end of the function, because we are
sure they exist.
- remove the unnecessary goto in the return value check of the write

Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Miao Xie
2013-07-02 23:50:56 +0800

26b258919 Btrfs: fix oops when recovering the file data by scrub function

We get oops while running btrfs replace start test,
------------[ cut here ]------------
kernel BUG at mm/filemap.c:608!
[SNIP]
Call Trace:
[] copy_nocow_pages_for_inode+0x217/0x3f0 [btrfs]
[] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[] iterate_extent_inodes+0x1ae/0x300 [btrfs]
[] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
[] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[] copy_nocow_pages_worker+0x97/0x150 [btrfs]
[] worker_loop+0x134/0x540 [btrfs]
[] ? __schedule+0x3ca/0x7f0
[] ? btrfs_queue_worker+0x300/0x300 [btrfs]
[] kthread+0xc0/0xd0
[] ? flush_kthread_worker+0x80/0x80
[] ret_from_fork+0x7c/0xb0
[] ? flush_kthread_worker+0x80/0x80
[SNIP]
RIP [] unlock_page+0x35/0x40
RSP
---[ end trace 421e79ad0dd72c7d ]---

This is because we forgot to lock the page again after reading data into
the page. Fix it.

Signed-off-by: Lin Feng
Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Miao Xie
2013-07-02 23:50:55 +0800

f51a4a182 Btrfs: remove btrfs_sector_sum structure

Using the structure btrfs_sector_sum to keep the checksum value is
unnecessary, because the extents that btrfs_sector_sum points to are
continuous, we can find out the expected checksums by btrfs_ordered_sum's
bytenr and the offset, so we can remove btrfs_sector_sum's bytenr. After
removing bytenr, there is only one member in the structure, so it makes
no sense to keep the structure, just remove it, and use a u32 array to
store the checksum value.

With this change, we no longer need a while loop to fetch the checksums one
by one. We can now get several checksum values at a time, which improved
performance by ~74% on my SSD (31MB/s -> 54MB/s).
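
The idea of deriving a checksum's position from the start bytenr and an offset can be sketched as follows. This is a simplified Python model rather than the kernel structure: the `OrderedSum` class, its `lookup` method and the 4096-byte sector size are assumptions made for illustration.

```python
SECTORSIZE = 4096  # assumed sector size for this sketch


class OrderedSum:
    """Toy stand-in for an ordered sum: one start bytenr plus a flat array."""

    def __init__(self, bytenr, sums):
        self.bytenr = bytenr  # start of the contiguous range
        self.sums = sums      # one checksum per sector, in order

    def lookup(self, bytenr, nsectors=1):
        """Return the checksums for `nsectors` sectors starting at `bytenr`."""
        index = (bytenr - self.bytenr) // SECTORSIZE
        if index < 0 or index + nsectors > len(self.sums):
            raise ValueError("range is not covered by this ordered sum")
        # A single slice replaces the per-sector loop over (bytenr, sum) pairs.
        return self.sums[index:index + nsectors]


osum = OrderedSum(1 << 20, [0xA, 0xB, 0xC, 0xD])
print(osum.lookup((1 << 20) + 2 * SECTORSIZE, 2))
```

Because the covered range is contiguous, each entry no longer needs its own bytenr, and several checksums can be fetched with one slice.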

test command:
# dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync

Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Miao Xie
2013-07-02 23:50:47 +0800

01 Jul, 2013

1 commit

d88d46c6e Btrfs: free csums when we're done scrubbing an extent

A user reported scrub taking up an unreasonable amount of ram as it ran. This
is because we lookup the csums for the extent we're scrubbing but don't free it
up until after we're done with the scrub, which means we can take up a whole lot
of ram. This patch fixes this by dropping the csums once we're done with the
extent we've scrubbed. The user reported this to fix their problem. Thanks,

Reported-and-tested-by: Remco Hosman
Signed-off-by: Josef Bacik

Josef Bacik
2013-07-01 20:52:28 +0800

18 May, 2013

1 commit

9be3395bc Btrfs: use a btrfs bioset instead of abusing bio internals

Btrfs has been pointer tagging bi_private and using bi_bdev
to store the stripe index and mirror number of failed IOs.

As bios bubble back up through the call chain, we use these
to decide if and how to retry our IOs. They are also used
to count IO failures on a per device basis.

A recently added bio tracepoint led to crashes because
we were abusing bi_bdev.

This commit adds a btrfs bioset, and creates explicit fields
for the mirror number and stripe index. The plan is to
extend this structure for all of the fields currently in
struct btrfs_bio, which will mean one less kmalloc in
our IO path.

Signed-off-by: Chris Mason
Reported-by: Tejun Heo

Chris Mason
2013-05-18 09:52:52 +0800

07 May, 2013

3 commits

625f1c8dc Btrfs: improve the loop of scrub_stripe

1) Right now scrub_stripe() keeps looping in some unnecessary cases:
* when the found extent item's objectid is already outside the dev extent's range
but we haven't finished scanning the whole range within the dev extent
* when all the items have been processed but we haven't finished scanning the
whole range within the dev extent

In both cases, we can just end the loop to save costs.

2) Besides, when the found extent item's length is larger than the stripe
len (64k), we don't have to release the path and search again, as that would land on
the same key used in the last loop; we can instead advance the logical cursor in
place until all of the extent's space is scanned.

3) We currently use 0 as the key's offset to search the btree, then step to the
previous item to find a smaller item, and then have to move to the next one again
to get the right item. Setting offset=-1 and using previous_item() is the correct way.

4) As we won't find any checksum at an offset unless that offset is in a data
extent, we can look up checksums only when we're really going to scrub an extent.
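
Point 3 can be illustrated with a sorted list of keys. The sketch below is a Python model, not the btree code: `find_extent_at_or_before` is a hypothetical helper, keys are modeled as (objectid, type, offset) tuples, and 168 is used as the extent item key type.

```python
import bisect

U64_MAX = 2**64 - 1   # what offset = (u64)-1 means
EXTENT_ITEM = 168     # extent item key type, as used in this sketch


def find_extent_at_or_before(keys, logical):
    """Return the last extent item key whose objectid is <= logical.

    keys: sorted list of (objectid, type, offset) tuples.
    """
    # Search for the largest possible key with this objectid and type ...
    target = (logical, EXTENT_ITEM, U64_MAX)
    slot = bisect.bisect_left(keys, target)
    # ... then step back to the previous item of the wanted type.
    while slot > 0:
        slot -= 1
        if keys[slot][1] == EXTENT_ITEM:
            return keys[slot]
    return None


# For extent items the key offset holds the extent length.
keys = [
    (4096, EXTENT_ITEM, 8192),
    (16384, EXTENT_ITEM, 131072),
    (200000, EXTENT_ITEM, 4096),
]
print(find_extent_at_or_before(keys, 65536))
```

Searching with the maximum offset lands just past every key for that objectid, so one step backwards reaches the wanted item without the extra forward step that searching with offset 0 needs.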

Signed-off-by: Liu Bo
Signed-off-by: Josef Bacik

Liu Bo
2013-05-07 03:55:26 +0800

48a3b6366 btrfs: make static code static & remove dead code

Big patch, but all it does is add statics to functions which
are in fact static, then remove the associated dead-code fallout.

removed functions:

btrfs_iref_to_path()
__btrfs_lookup_delayed_deletion_item()
__btrfs_search_delayed_insertion_item()
__btrfs_search_delayed_deletion_item()
find_eb_for_page()
btrfs_find_block_group()
range_straddles_pages()
extent_range_uptodate()
btrfs_file_extent_length()
btrfs_scrub_cancel_devid()
btrfs_start_transaction_lflush()

btrfs_print_tree() is left because it is used for debugging.
btrfs_start_transaction_lflush() and btrfs_reada_detach() are
left for symmetry.

ulist.c functions are left, another patch will take care of those.

Signed-off-by: Eric Sandeen
Signed-off-by: Josef Bacik

Eric Sandeen
2013-05-07 03:55:23 +0800

3173a18f7 Btrfs: add a incompatible format change for smaller metadata extent refs

We currently store the first key of the tree block inside the reference for the
tree block in the extent tree. This takes up quite a bit of space. Make a new
key type for metadata which holds the level as the offset and completely removes
storing the btrfs_tree_block_info inside the extent ref. This reduces the size
from 51 bytes to 33 bytes per extent reference for each tree block. In practice
this results in a 30-35% decrease in the size of our extent tree, which means we
COW less and can keep more of the extent tree in memory which makes our heavy
metadata operations go much faster. This is not an automatic format change, you
must enable it at mkfs time or with btrfstune. This patch deals with having
metadata stored as either the old format or the new format so it is easy to
convert. Thanks,
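
The 51 and 33 byte figures can be checked with a little arithmetic. The field layouts in this Python sketch are an assumption based on the packed on-disk structures (extent item, tree block info, inline ref); only the two totals are stated in the message above.

```python
import struct

# Packed little-endian layouts, assumed for illustration:
EXTENT_ITEM = struct.calcsize("<QQQ")  # refs, generation, flags
DISK_KEY = struct.calcsize("<QBQ")     # objectid, type, offset
TREE_BLOCK_INFO = DISK_KEY + struct.calcsize("<B")  # first key + level
INLINE_REF = struct.calcsize("<BQ")    # ref type + offset

old_size = EXTENT_ITEM + TREE_BLOCK_INFO + INLINE_REF  # old format
new_size = EXTENT_ITEM + INLINE_REF  # skinny: no tree block info

saved = old_size - new_size
print(old_size, new_size, saved)
print(round(100 * saved / old_size, 1))
```

Under these assumptions the per-reference saving is the size of the dropped tree block info, which is consistent with the 30-35% reduction in extent tree size quoted above.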

Signed-off-by: Josef Bacik

Josef Bacik
2013-05-07 03:54:18 +0800