Doug / smarc-fsl-linux-kernel | Embedian Git Server

23 Sep, 2013

1 commit

0fbf2cc98 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes from Chris Mason:
"These are mostly bug fixes and a two small performance fixes. The
most important of the bunch are Josef's fix for a snapshotting
regression and Mark's update to fix compile problems on arm"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
Btrfs: create the uuid tree on remount rw
btrfs: change extent-same to copy entire argument struct
Btrfs: dir_inode_operations should use btrfs_update_time also
btrfs: Add btrfs: prefix to kernel log output
btrfs: refuse to remount read-write after abort
Btrfs: btrfs_ioctl_default_subvol: Revert back to toplevel subvolume when arg is 0
Btrfs: don't leak transaction in btrfs_sync_file()
Btrfs: add the missing mutex unlock in write_all_supers()
Btrfs: iput inode on allocation failure
Btrfs: remove space_info->reservation_progress
Btrfs: kill delay_iput arg to the wait_ordered functions
Btrfs: fix worst case calculator for space usage
Revert "Btrfs: rework the overcommit logic to be based on the total size"
Btrfs: improve replacing nocow extents
Btrfs: drop dir i_size when adding new names on replay
Btrfs: replay dir_index items before other items
Btrfs: check roots last log commit when checking if an inode has been logged
Btrfs: actually log directory we are fsync()'ing
Btrfs: actually limit the size of delalloc range
Btrfs: allocate the free space by the existed max extent size when ENOSPC
...

Linus Torvalds
2013-09-23 05:58:49 +0800

21 Sep, 2013

1 commit

a48203988 Btrfs: allocate the free space by the existed max extent size when ENOSPC ... Browse Code »

By the current code, if the requested size is very large, and all the extents
in the free space cache are small, we will waste lots of the cpu time to cut
the requested size in half and search the cache again and again until it gets
down to the size the allocator can return. In fact, we can know the max extent
size in the cache after the first search, so we needn't cut the size in half
repeatedly, and just use the max extent size directly. This way can save
lots of cpu time and make the performance grow up when there are only fragments
in the free space cache.

According to my test, if there are only 4KB free space extents in the fs,
and the total size of those extents are 256MB, we can reduce the execute
time of the following test from 5.4s to 1.4s.
dd if=/dev/zero of= bs=1MB count=1 oflag=sync

Changelog v2 -> v3:
- fix the problem that we skip the block group with the space which is
less than we need.

Changelog v1 -> v2:
- address the problem that we return a wrong start position when searching
the free space in a bitmap.

Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Miao Xie
2013-09-21 23:05:23 +0800

13 Sep, 2013

2 commits

ac4de9543 Merge branch 'akpm' (patches from Andrew Morton) ... Browse Code »

Merge more patches from Andrew Morton:
"The rest of MM. Plus one misc cleanup"

* emailed patches from Andrew Morton : (35 commits)
mm/Kconfig: add MMU dependency for MIGRATION.
kernel: replace strict_strto*() with kstrto*()
mm, thp: count thp_fault_fallback anytime thp fault fails
thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page()
thp: do_huge_pmd_anonymous_page() cleanup
thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
mm: cleanup add_to_page_cache_locked()
thp: account anon transparent huge pages into NR_ANON_PAGES
truncate: drop 'oldsize' truncate_pagecache() parameter
mm: make lru_add_drain_all() selective
memcg: document cgroup dirty/writeback memory statistics
memcg: add per cgroup writeback pages accounting
memcg: check for proper lock held in mem_cgroup_update_page_stat
memcg: remove MEMCG_NR_FILE_MAPPED
memcg: reduce function dereference
memcg: avoid overflow caused by PAGE_ALIGN
memcg: rename RESOURCE_MAX to RES_COUNTER_MAX
memcg: correct RESOURCE_MAX to ULLONG_MAX
mm: memcg: do not trap chargers with full callstack on OOM
mm: memcg: rework and document OOM waiting and wakeup
...

Linus Torvalds
2013-09-13 06:44:27 +0800
7caef2676 truncate: drop 'oldsize' truncate_pagecache() parameter ... Browse Code »

truncate_pagecache() doesn't care about old size since commit
cedabed49b39 ("vfs: Fix vmtruncate() regression"). Let's drop it.

Signed-off-by: Kirill A. Shutemov
Cc: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2013-09-13 06:38:02 +0800

01 Sep, 2013

4 commits

b12d6869f Btrfs: convert all bug_ons in free-space-cache.c ... Browse Code »

All of these are logic checks to make sure we're not breaking anything, so
convert them over to ASSERT(). Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2013-09-01 20:16:33 +0800
c1c9ff7c9 Btrfs: Remove superfluous casts from u64 to unsigned long long ... Browse Code »

u64 is "unsigned long long" on all architectures now, so there's no need to
cast it when formatting it using the "ll" length modifier.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Geert Uytterhoeven
2013-09-01 20:16:08 +0800
dc11dd5d7 Btrfs: separate out tests into their own directory ... Browse Code »

The plan is to have a bunch of unit tests that run when btrfs is loaded when you
build with the appropriate config option. My ultimate goal is to have a test
for every non-static function we have, but at first I'm going to focus on the
things that cause us the most problems. To start out with this just adds a
tests/ directory and moves the existing free space cache tests into that
directory and sets up all of the infrastructure. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2013-09-01 20:15:38 +0800
00361589d Btrfs: avoid starting a transaction in the write path ... Browse Code »

I noticed while looking at a deadlock that we are always starting a transaction
in cow_file_range(). This isn't really needed since we only need a transaction
if we are doing an inline extent, or if the allocator needs to allocate a chunk.
So push down all the transaction start stuff to be closer to where we actually
need a transaction in all of these cases. This will hopefully reduce our write
latency when we are committing often. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2013-09-01 20:05:05 +0800

10 Jul, 2013

1 commit

e3a0dd98e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs update from Chris Mason:
"These are the usual mixture of bugs, cleanups and performance fixes.
Miao has some really nice tuning of our crc code as well as our
transaction commits.

Josef is peeling off more and more problems related to early enospc,
and has a number of important bug fixes in here too"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits)
Btrfs: wait ordered range before doing direct io
Btrfs: only do the tree_mod_log_free_eb if this is our last ref
Btrfs: hold the tree mod lock in __tree_mod_log_rewind
Btrfs: make backref walking code handle skinny metadata
Btrfs: fix crash regarding to ulist_add_merge
Btrfs: fix several potential problems in copy_nocow_pages_for_inode
Btrfs: cleanup the code of copy_nocow_pages_for_inode()
Btrfs: fix oops when recovering the file data by scrub function
Btrfs: make the chunk allocator completely tree lockless
Btrfs: cleanup orphaned root orphan item
Btrfs: fix wrong mirror number tuning
Btrfs: cleanup redundant code in btrfs_submit_direct()
Btrfs: remove btrfs_sector_sum structure
Btrfs: check if we can nocow if we don't have data space
Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc
Btrfs: use a percpu to keep track of possibly pinned bytes
Btrfs: check for actual acls rather than just xattrs when caching no acl
Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
Btrfs: optimize reada_for_balance
Btrfs: optimize read_block_for_search
...

Linus Torvalds
2013-07-10 03:33:09 +0800

14 Jun, 2013

3 commits

4b286cd1f Btrfs: return error code in btrfs_check_trunc_cache_free_space() ... Browse Code »

Fix to return error code instead always return 0 from function
btrfs_check_trunc_cache_free_space().
Introduced by commit 7b61cd92242542944fc27024900c495a6a7b3396
(Btrfs: don't use global block reservation for inode cache truncation)

Signed-off-by: Wei Yongjun
Reviewed-by: Miao Xie
Signed-off-by: Josef Bacik

Wei Yongjun
2013-06-14 23:29:56 +0800
e6d296058 btrfs: move ifdef around sanity checks out of init_btrfs_fs ... Browse Code »

Signed-off-by: David Sterba
Signed-off-by: Josef Bacik

David Sterba
2013-06-14 23:29:18 +0800
905d0f564 btrfs: add prefix to sanity tests messages ... Browse Code »

And change the message level to KERN_INFO.

Signed-off-by: David Sterba
Signed-off-by: Josef Bacik

David Sterba
2013-06-14 23:29:17 +0800

28 May, 2013

1 commit

8b513d0cf treewide: Fix typo in printk ... Browse Code »

Correct spelling typo in various part of drivers

Signed-off-by: Masanari Iida
Signed-off-by: Jiri Kosina

Masanari Iida
2013-05-28 18:02:13 +0800

18 May, 2013

2 commits

7b61cd922 Btrfs: don't use global block reservation for inode cache truncation ... Browse Code »

It is very likely that there are lots of subvolumes/snapshots in the filesystem,
so if we use global block reservation to do inode cache truncation, we may hog
all the free space that is reserved in global rsv. So it is better that we do
the free space reservation for inode cache truncation by ourselves.

Cc: Tsutomu Itoh
Signed-off-by: Miao Xie
Signed-off-by: Josef Bacik

Miao Xie
2013-05-18 09:40:22 +0800
73e1e61fb Btrfs: remove warn on in free space cache writeout ... Browse Code »

This catches block groups that are too large to properly cache. We deal with
this case fine, so the warning just confuses users. Remove the warning.
Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2013-05-18 09:40:13 +0800

07 May, 2013

5 commits

48a3b6366 btrfs: make static code static & remove dead code ... Browse Code »

Big patch, but all it does is add statics to functions which
are in fact static, then remove the associated dead-code fallout.

removed functions:

btrfs_iref_to_path()
__btrfs_lookup_delayed_deletion_item()
__btrfs_search_delayed_insertion_item()
__btrfs_search_delayed_deletion_item()
find_eb_for_page()
btrfs_find_block_group()
range_straddles_pages()
extent_range_uptodate()
btrfs_file_extent_length()
btrfs_scrub_cancel_devid()
btrfs_start_transaction_lflush()

btrfs_print_tree() is left because it is used for debugging.
btrfs_start_transaction_lflush() and btrfs_reada_detach() are
left for symmetry.

ulist.c functions are left, another patch will take care of those.

Signed-off-by: Eric Sandeen
Signed-off-by: Josef Bacik

Eric Sandeen
2013-05-07 03:55:23 +0800
b50c6e250 Btrfs: deal with free space cache errors while replaying log ... Browse Code »

So everybody who got hit by my fsync bug will still continue to hit this
BUG_ON() in the free space cache, which is pretty heavy handed. So I took a
file system that had this bug and fixed up all the BUG_ON()'s and leaks that
popped up when I tried to mount a broken file system like this. With this patch
we just fail to mount instead of panicing. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2013-05-07 03:55:20 +0800
c2cf52eb7 Btrfs: Include the device in most error printk()s ... Browse Code »

With more than one btrfs volume mounted, it can be very difficult to find
out which volume is hitting an error. btrfs_error() will print this, but
it is currently rigged as more of a fatal error handler, while many of
the printk()s are currently for debugging and yet-unhandled cases.

This patch just changes the functions where the device information is
already available. Some cases remain where the root or fs_info is not
passed to the function emitting the error.

This may introduce some confusion with volumes backed by multiple devices
emitting errors referring to the primary device in the set instead of the
one on which the error occurred.

Use btrfs_printk(fs_info, format, ...) rather than writing the device
string every time, and introduce macro wrappers ala XFS for brevity.
Since the function already cannot be used for continuations, print a
newline as part of the btrfs_printk() message rather than at each caller.

Signed-off-by: Simon Kirby
Reviewed-by: David Sterba
Signed-off-by: Josef Bacik

Simon Kirby
2013-05-07 03:54:23 +0800
b0496686b Btrfs: cleanup unused arguments of btrfs_csum_data ... Browse Code »

Argument 'root' is no more used in btrfs_csum_data().

Signed-off-by: Liu Bo
Signed-off-by: Josef Bacik

Liu Bo
2013-05-07 03:54:14 +0800
74255aa07 Btrfs: add some free space cache tests ... Browse Code »

We keep hitting bugs in the tree log replay because btrfs_remove_free_space
doesn't account for some corner case. So add a bunch of tests to try and fully
test btrfs_remove_free_space since the only time it is called is during tree log
replay. These tests all finish successfully, so as we find more of these bugs
we need to add to these tests to make sure we don't regress in fixing things.
I've hidden the tests behind a Kconfig option, but they take no time to run so
all btrfs developers should have this turned on all the time. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2013-05-07 03:52:54 +0800

21 Feb, 2013

2 commits

e942f883b Merge branch 'raid56-experimental' into for-linus-3.9 ... Browse Code »

Signed-off-by: Chris Mason

Conflicts:
fs/btrfs/ctree.h
fs/btrfs/extent-tree.c
fs/btrfs/inode.c
fs/btrfs/volumes.c

Chris Mason
2013-02-21 03:06:05 +0800
dde5740fd Btrfs: relax the block group size limit for bitmaps ... Browse Code »

Dave pointed out that xfstests 273 will tell you that it failed to load the
space cache for a block group when it remounts. This is because we run out
of space writing out the block group cache. This is ok and is working as it
should, but let's try to be a bit nicer. This happens because the block
group was 100mb, but bitmap entries cover 128mb, so we were only getting
extent entries for this block group, which ended up being too many to fit in
the free space cache. So relax the bitmap size requirements to block groups
that are at least half the size a bitmap will cover or larger, that way we
can still keep the amount of space used in the free space cache low enough
to be able to write it out. With this patch I no longer fail to write out
the free space cache. Thanks,

Reported-by: David Sterba
Signed-off-by: Josef Bacik

Josef Bacik
2013-02-21 01:59:55 +0800

05 Feb, 2013

1 commit

0e4e02636 Merge branch 'for-linus' into raid56-experimental ... Browse Code »

Conflicts:
fs/btrfs/volumes.c

Signed-off-by: Chris Mason

Chris Mason
2013-02-05 23:04:03 +0800

02 Feb, 2013

1 commit

53b381b3a Btrfs: RAID5 and RAID6 ... Browse Code »

This builds on David Woodhouse's original Btrfs raid5/6 implementation.
The code has changed quite a bit, blame Chris Mason for any bugs.

Read/modify/write is done after the higher levels of the filesystem have
prepared a given bio. This means the higher layers are not responsible
for building full stripes, and they don't need to query for the topology
of the extents that may get allocated during delayed allocation runs.
It also means different files can easily share the same stripe.

But, it does expose us to incorrect parity if we crash or lose power
while doing a read/modify/write cycle. This will be addressed in a
later commit.

Scrub is unable to repair crc errors on raid5/6 chunks.

Discard does not work on raid5/6 (yet)

The stripe size is fixed at 64KiB per disk. This will be tunable
in a later commit.

Signed-off-by: Chris Mason

David Woodhouse
2013-02-02 03:24:23 +0800

25 Jan, 2013

1 commit

b0175117b Btrfs: fix panic when recovering tree log ... Browse Code »

A user reported a BUG_ON(ret) that occured during tree log replay. Ret was
-EAGAIN, so what I think happened is that we removed an extent that covered
a bitmap entry and an extent entry. We remove the part from the bitmap and
return -EAGAIN and then search for the next piece we want to remove, which
happens to be an entire extent entry, so we just free the sucker and return.
The problem is ret is still set to -EAGAIN so we trip the BUG_ON(). The
user used btrfs-zero-log so I'm not 100% sure this is what happened so I've
added a WARN_ON() to catch the other possibility. Thanks,

Reported-by: Jan Steffens
Signed-off-by: Josef Bacik

Josef Bacik
2013-01-25 01:49:49 +0800

17 Dec, 2012

2 commits

960097622 Btrfs: use ctl->unit for free space calculation instead of block_group->sectorsize ... Browse Code »

We should use ctl->unit for free space calculation instead of block_group->sectorsize
even though for free space use_bitmap or free space cluster we only have sectorsize assigned to ctl->unit currently. Also, we can keep it consisten in code style.

Signed-off-by: Wang Sheng-Hui
Signed-off-by: Chris Mason

Wang Sheng-Hui
2012-12-17 09:46:17 +0800
071401258 Btrfs: do not warn_on io_ctl->cur in io_ctl_map_page ... Browse Code »

io_ctl_map_page is called by many functions in free-space-cache.
In most scenarios, the ->cur is not null, e.g. io_ctl_add_entry.
I think we'd better remove the warn_on here.

Signed-off-by: Wang Sheng-Hui
Signed-off-by: Chris Mason

Wang Sheng-Hui
2012-12-17 09:46:06 +0800

12 Dec, 2012

1 commit

de6c4115a Btrfs: fix unnecessary while loop when search the free space, cache ... Browse Code »

When we find a bitmap free space entry, we may check the previous extent
entry covers the offset or not. But if we find this entry is also a bitmap
entry, we will continue to check the previous entry of the current one by
a while loop. It is unnecessary because it is impossible that the extent
entry which is in front of a bitmap entry can cover the offset of the entry
after that bitmap entry.

Signed-off-by: Miao Xie
Reviewed-by: Liu Bo
Signed-off-by: Chris Mason

Miao Xie
2012-12-12 02:31:33 +0800

09 Oct, 2012

1 commit

e6138876a Btrfs: cache extent state when writing out dirty metadata pages ... Browse Code »

Everytime we write out dirty pages we search for an offset in the tree,
convert the bits in the state, and then when we wait we search for the
offset again and clear the bits. So for every dirty range in the io tree we
are doing 4 rb searches, which is suboptimal. With this patch we are only
doing 2 searches for every cycle (modulo weird things happening). Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-10-09 21:15:41 +0800

04 Oct, 2012

1 commit

ebb3dad43 Btrfs: using for_each_set_bit_from to simplify the code ... Browse Code »

Using for_each_set_bit_from() to simplify the code.

spatch with a semantic match is used to found this.
(http://coccinelle.lip6.fr/)

Signed-off-by: Wei Yongjun

Wei Yongjun
2012-10-04 21:39:55 +0800

24 Jul, 2012

1 commit

f6175efab Btrfs: do not count in readonly bytes ... Browse Code »

If a block group is ro, do not count its entries in when we dump space info.

Signed-off-by: Liu Bo
Signed-off-by: Josef Bacik

Liu Bo
2012-07-24 04:28:03 +0800

06 Jul, 2012

1 commit

5eecb9cc9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs updates from Chris Mason:
"I held off on my rc5 pull because I hit an oops during log recovery
after a crash. I wanted to make sure it wasn't a regression because
we have some logging fixes in here.

It turns out that a commit during the merge window just made it much
more likely to trigger directory logging instead of full commits,
which exposed an old bug.

The new backref walking code got some additional fixes. This should
be the final set of them.

Josef fixed up a corner where our O_DIRECT writes and buffered reads
could expose old file contents (not stale, just not the most recent).
He and Liu Bo fixed crashes during tree log recover as well.

Ilya fixed errors while we resume disk balancing operations on
readonly mounts."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: run delayed directory updates during log replay
Btrfs: hold a ref on the inode during writepages
Btrfs: fix tree log remove space corner case
Btrfs: fix wrong check during log recovery
Btrfs: use _IOR for BTRFS_IOC_SUBVOL_GETFLAGS
Btrfs: resume balance on rw (re)mounts properly
Btrfs: restore restriper state on all mounts
Btrfs: fix dio write vs buffered read race
Btrfs: don't count I/O statistic read errors for missing devices
Btrfs: resolve tree mod log locking issue in btrfs_next_leaf
Btrfs: fix tree mod log rewind of ADD operations
Btrfs: leave critical region in btrfs_find_all_roots as soon as possible
Btrfs: always put insert_ptr modifications into the tree mod log
Btrfs: fix tree mod log for root replacements at leaf level
Btrfs: support root level changes in __resolve_indirect_ref
Btrfs: avoid waiting for delayed refs when we must not

Linus Torvalds
2012-07-06 04:06:25 +0800

03 Jul, 2012

1 commit

bdb7d303b Btrfs: fix tree log remove space corner case ... Browse Code »

The tree log stuff can have allocated space that we end up having split
across a bitmap and a real extent. The free space code does not deal with
this, it assumes that if it finds an extent or bitmap entry that the entire
range must fall within the entry it finds. This isn't necessarily the case,
so rework the remove function so it can handle this case properly. This
fixed two panics the user hit, first in the case where the space was
initially in a bitmap and then in an extent entry, and then the reverse
case. Thanks,

Reported-and-tested-by: Shaun Reich
Signed-off-by: Josef Bacik

Josef Bacik
2012-07-03 03:39:18 +0800

02 Jun, 2012

1 commit

1193755ac Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs changes from Al Viro.
"A lot of misc stuff. The obvious groups:
* Miklos' atomic_open series; kills the damn abuse of
->d_revalidate() by NFS, which was the major stumbling block for
all work in that area.
* ripping security_file_mmap() and dealing with deadlocks in the
area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
general.
* ->encode_fh() switched to saner API; insane fake dentry in
mm/cleancache.c gone.
* assorted annotations in fs (endianness, __user)
* parts of Artem's ->s_dirty work (jff2 and reiserfs parts)
* ->update_time() work from Josef.
* other bits and pieces all over the place.

Normally it would've been in two or three pull requests, but
signal.git stuff had eaten a lot of time during this cycle ;-/"

Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
'truncate_range' inode method was removed by the VM changes, the VFS
update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
to sparse fix added twice, with other changes nearby).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
nfs: don't open in ->d_revalidate
vfs: retry last component if opening stale dentry
vfs: nameidata_to_filp(): don't throw away file on error
vfs: nameidata_to_filp(): inline __dentry_open()
vfs: do_dentry_open(): don't put filp
vfs: split __dentry_open()
vfs: do_last() common post lookup
vfs: do_last(): add audit_inode before open
vfs: do_last(): only return EISDIR for O_CREAT
vfs: do_last(): check LOOKUP_DIRECTORY
vfs: do_last(): make ENOENT exit RCU safe
vfs: make follow_link check RCU safe
vfs: do_last(): use inode variable
vfs: do_last(): inline walk_component()
vfs: do_last(): make exit RCU safe
vfs: split do_lookup()
Btrfs: move over to use ->update_time
fs: introduce inode operation ->update_time
reiserfs: get rid of resierfs_sync_super
reiserfs: mark the superblock as dirty a bit later
...

Linus Torvalds
2012-06-02 01:34:35 +0800

30 May, 2012

3 commits

cd023e7b1 Btrfs: merge contigous regions when loading free space cache ... Browse Code »

When we write out the free space cache we will write out everything that is
in our in memory tree, and then we will just walk the pinned extents tree
and write anything we see there. The problem with this is that during
normal operations the pinned extents will be merged back into the free space
tree normally, and then we can allocate space from the merged areas and
commit them to the tree log. If we crash and replay the tree log we will
crash again because the tree log will try to free up space from what looks
like 2 seperate but contiguous entries, since one entry is from the original
free space cache and the other was a pinned extent that was merged back. To
fix this we just need to walk the free space tree after we load it and merge
contiguous entries back together. This will keep the tree log stuff from
breaking and it will make the allocator behave more nicely. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-05-30 22:23:36 +0800
5fd020435 Btrfs: finish ordered extents in their own thread ... Browse Code »

We noticed that the ordered extent completion doesn't really rely on having
a page and that it could be done independantly of ending the writeback on a
page. This patch makes us not do the threaded endio stuff for normal
buffered writes and direct writes so we can end page writeback as soon as
possible (in irq context) and only start threads to do the ordered work when
it is actually done. Compression needs to be reworked some to take
advantage of this as well, but atm it has to do a find_get_page in its endio
handler so it must be done in its own thread. This makes direct writes
quite a bit faster. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-05-30 22:23:33 +0800
528c03276 btrfs: trivial endianness annotations ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2012-05-30 11:28:35 +0800

14 Apr, 2012

1 commit

659e45d8a Merge branch 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull the minimal btrfs branch from Chris Mason:
"We have a use-after-free in there, along with errors when mount -o
discard is enabled, and a BUG_ON(we should compile with UP more
often)."

* 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: use commit root when loading free space cache
Btrfs: fix use-after-free in __btrfs_end_transaction
Btrfs: check return value of bio_alloc() properly
Btrfs: remove lock assert from get_restripe_target()
Btrfs: fix eof while discarding extents
Btrfs: fix uninit variable in repair_eb_io_failure
Revert "Btrfs: increase the global block reserve estimates"

Linus Torvalds
2012-04-14 10:41:27 +0800

13 Apr, 2012

1 commit

d53ba4748 Btrfs: use commit root when loading free space cache ... Browse Code »

A user reported that booting his box up with btrfs root on 3.4 was way
slower than on 3.3 because I removed the ideal caching code. It turns out
that we don't load the free space cache if we're in a commit for deadlock
reasons, but since we're reading the cache and it hasn't changed yet we are
safe reading the inode and free space item from the commit root, so do that
and remove all of the deadlock checks so we don't unnecessarily skip loading
the free space cache. The user reported this fixed the slowness. Thanks,

Tested-by: Calvin Walton
Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2012-04-13 08:54:01 +0800

31 Mar, 2012

1 commit

9613bebb2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes and features from Chris Mason:
"We've merged in the error handling patches from SuSE. These are
already shipping in the sles kernel, and they give btrfs the ability
to abort transactions and go readonly on errors. It involves a lot of
churn as they clarify BUG_ONs, and remove the ones we now properly
deal with.

Josef reworked the way our metadata interacts with the page cache.
page->private now points to the btrfs extent_buffer object, which
makes everything faster. He changed it so we write an whole extent
buffer at a time instead of allowing individual pages to go down,,
which will be important for the raid5/6 code (for the 3.5 merge
window ;)

Josef also made us more aggressive about dropping pages for metadata
blocks that were freed due to COW. Overall, our metadata caching is
much faster now.

We've integrated my patch for metadata bigger than the page size.
This allows metadata blocks up to 64KB in size. In practice 16K and
32K seem to work best. For workloads with lots of metadata, this cuts
down the size of the extent allocation tree dramatically and fragments
much less.

Scrub was updated to support the larger block sizes, which ended up
being a fairly large change (thanks Stefan Behrens).

We also have an assortment of fixes and updates, especially to the
balancing code (Ilya Dryomov), the back ref walker (Jan Schmidt) and
the defragging code (Liu Bo)."

Fixed up trivial conflicts in fs/btrfs/scrub.c that were just due to
removal of the second argument to k[un]map_atomic() in commit
7ac687d9e047.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (75 commits)
Btrfs: update the checks for mixed block groups with big metadata blocks
Btrfs: update to the right index of defragment
Btrfs: do not bother to defrag an extent if it is a big real extent
Btrfs: add a check to decide if we should defrag the range
Btrfs: fix recursive defragment with autodefrag option
Btrfs: fix the mismatch of page->mapping
Btrfs: fix race between direct io and autodefrag
Btrfs: fix deadlock during allocating chunks
Btrfs: show useful info in space reservation tracepoint
Btrfs: don't use crc items bigger than 4KB
Btrfs: flush out and clean up any block device pages during mount
btrfs: disallow unequal data/metadata blocksize for mixed block groups
Btrfs: enhance superblock sanity checks
Btrfs: change scrub to support big blocks
Btrfs: minor cleanup in scrub
Btrfs: introduce common define for max number of mirrors
Btrfs: fix infinite loop in btrfs_shrink_device()
Btrfs: fix memory leak in resolver code
Btrfs: allow dup for data chunks in mixed mode
Btrfs: validate target profiles only if we are going to use them
...

Linus Torvalds
2012-03-31 03:44:29 +0800