Doug / smarc-fsl-linux-kernel | Embedian Git Server

27 Jul, 2012

1 commit

e2aed8dfa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull large btrfs update from Chris Mason:
"This pull request is very large, and the two main features in here
have been under testing/devel for quite a while.

We have subvolume quotas from the strato developers. This enables
full tracking of how many blocks are allocated to each subvolume (and
all snapshots) and you can set limits on a per-subvolume basis. You
can also create quota groups and toss multiple subvolumes into a big
group. It's everything you need to be a web hosting company and give
each user their own subvolume.

The userland side of the quotas is being refreshed, they'll send out
details on where to grab it soon.

Next is the kernel side of btrfs send/receive from Alexander Block.
This leverages the same infrastructure as the quota code to figure out
relationships between blocks and their owners. It can then compute
the difference between two snapshots and sends the diffs in a neutral
format into userland.

The basic model:

create a snapshot
send that snapshot as the initial backup
make changes
create a second snapshot
send the incremental as a backup
delete the first snapshot
(use the second snapshot for the next incremental)

The receive portion is all in userland, and in the 'next' branch of my
btrfs-progs repo.

There's still some work to do in terms of optimizing the send side
from kernel to userland. The really important part is figuring out
how two snapshots are different, and this is where we are
concentrating right now. The initial send of a dataset is a little
slower than tar, but the incremental sends are dramatically faster
than what rsync can do.

On top of all of that, we have a nice queue of fixes, cleanups and
optimizations."

Fix up trivial modify/del conflict in fs/btrfs/ioctl.c

Also fix up semantic conflict in fs/btrfs/send.c: the interface to
dentry_open() changed in commit 765927b2d508 ("switch dentry_open() to
struct path, make it grab references itself"), and since it now grabs
whatever references it needs, we should no longer do the mntget() on the
mnt (and we need to dput() the dentry reference we took).

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (65 commits)
Btrfs: uninit variable fixes in send/receive
Btrfs: introduce BTRFS_IOC_SEND for btrfs send/receive
Btrfs: add btrfs_compare_trees function
Btrfs: introduce subvol uuids and times
Btrfs: make iref_to_path non static
Btrfs: add a barrier before a waitqueue_active check
Btrfs: call the ordered free operation without any locks held
Btrfs: Check INCOMPAT flags on remount and add helper function
Btrfs: add helper for tree enumeration
btrfs: allow cross-subvolume file clone
Btrfs: improve multi-thread buffer read
Btrfs: make btrfs's allocation smoothly with preallocation
Btrfs: lock the transition from dirty to writeback for an eb
Btrfs: fix potential race in extent buffer freeing
Btrfs: don't return true in releasepage unless we actually freed the eb
Btrfs: suppress printk() if all device I/O stats are zero
Btrfs: remove unwanted printk() for btrfs device I/O stats
Btrfs: rewrite BTRFS_SETGET_FUNCS
Btrfs: zero unused bytes in inode item
Btrfs: kill free_space pointer from inode structure
...

Conflicts:
fs/btrfs/ioctl.c

Linus Torvalds
2012-07-27 05:48:55 +0800

25 Jul, 2012

1 commit

d14b7a419 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial ... Browse Code »

Pull trivial tree from Jiri Kosina:
"Trivial updates all over the place as usual."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (29 commits)
Fix typo in include/linux/clk.h .
pci: hotplug: Fix typo in pci
iommu: Fix typo in iommu
video: Fix typo in drivers/video
Documentation: Add newline at end-of-file to files lacking one
arm,unicore32: Remove obsolete "select MISC_DEVICES"
module.c: spelling s/postition/position/g
cpufreq: Fix typo in cpufreq driver
trivial: typo in comment in mksysmap
mach-omap2: Fix typo in debug message and comment
scsi: aha152x: Fix sparse warning and make printing pointer address more portable.
Change email address for Steve Glendinning
Btrfs: fix typo in convert_extent_bit
via: Remove bogus if check
netprio_cgroup.c: fix comment typo
backlight: fix memory leak on obscure error path
Documentation: asus-laptop.txt references an obsolete Kconfig item
Documentation: ManagementStyle: fixed typo
mm/vmscan: cleanup comment error in balance_pgdat
mm: cleanup on the comments of zone_reclaim_stat
...

Linus Torvalds
2012-07-25 04:34:56 +0800

24 Jul, 2012

5 commits

67c9684f4 Btrfs: improve multi-thread buffer read ... Browse Code »

While testing with my buffer read fio jobs[1], I find that btrfs does not
perform well enough.

Here is a scenario in fio jobs:

We have 4 threads, "t1 t2 t3 t4", starting to buffer read a same file,
and all of them will race on add_to_page_cache_lru(), and if one thread
successfully puts its page into the page cache, it takes the responsibility
to read the page's data.

And what's more, reading a page needs a period of time to finish, in which
other threads can slide in and process rest pages:

t1 t2 t3 t4
add Page1
read Page1 add Page2
| read Page2 add Page3
| | read Page3 add Page4
| | | read Page4
-----|------------|-----------|-----------|--------
v v v v
bio bio bio bio

Now we have four bios, each of which holds only one page since we need to
maintain consecutive pages in bio. Thus, we can end up with far more bios
than we need.

Here we're going to
a) delay the real read-page section and
b) try to put more pages into page cache.

With that said, we can make each bio hold more pages and reduce the number
of bios we need.

Here is some numbers taken from fio results:
w/o patch w patch
------------- -------- ---------------
READ: 745MB/s +25% 934MB/s

[1]:
[global]
group_reporting
thread
numjobs=4
bs=32k
rw=read
ioengine=sync
directory=/mnt/btrfs/

[READ]
filename=foobar
size=2000M
invalidate=1

Signed-off-by: Liu Bo
Signed-off-by: Josef Bacik

Liu Bo
2012-07-24 04:28:10 +0800
51561ffec Btrfs: lock the transition from dirty to writeback for an eb ... Browse Code »

There is a small window where an eb can have no IO bits set on it, which
could potentially result in extent_buffer_under_io() returning false when we
want it to return true, which could result in not fun things happening. So
in order to protect this case we need to hold the refs_lock when we make
this transition to make sure we get reliable results out of
extent_buffer_udner_io(). Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-07-24 04:28:09 +0800
594831c4b Btrfs: fix potential race in extent buffer freeing ... Browse Code »

This sounds sort of impossible but it is the only thing I can think of and
at the very least it is theoretically possible so here it goes.

If we are in try_release_extent_buffer we will check that the ref count on
the extent buffer is 1 and not under IO, and then go down and clear the tree
ref. If between this check and clearing the tree ref somebody else comes in
and grabs a ref on the eb and the marks it dirty before
try_release_extent_buffer() does it's tree ref clear we can end up with a
dirty eb that will be freed while it is still dirty which will result in a
panic. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-07-24 04:28:09 +0800
e64860aa0 Btrfs: don't return true in releasepage unless we actually freed the eb ... Browse Code »

I noticed while looking at an extent_buffer race that we will
unconditionally return 1 if we get down to release_extent_buffer after
clearing the tree ref. However we can easily race in here and get a ref on
the eb and not actually free the eb. So make release_extent_buffer return 1
if it free'd the eb and 0 if not so we can be a little kinder to the vm.
Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-07-24 04:28:08 +0800
d5b025d51 btrfs read error corrected message floods the console during recovery ... Browse Code »

Changing printk_in_rcu to printk_ratelimited_in_rcu will suffice

Signed-off-by: Josef Bacik

Anand Jain
2012-07-24 04:28:04 +0800

12 Jul, 2012

1 commit

10983f2e8 Btrfs: fix typo in convert_extent_bit ... Browse Code »

It should be convert_extent_bit.

Signed-off-by: Liu Bo
Signed-off-by: Jiri Kosina

Liu Bo
2012-07-12 17:27:34 +0800

03 Jul, 2012

1 commit

7fd1a3f73 Btrfs: hold a ref on the inode during writepages ... Browse Code »

We can race with unlink and not actually be able to do our igrab in
btrfs_add_ordered_extent. This will result in all sorts of problems.
Instead of doing the complicated work to try and handle returning an error
properly from btrfs_add_ordered_extent, just hold a ref to the inode during
writepages. If we cannot grab a ref we know we're freeing this inode anyway
and can just drop the dirty pages on the floor, because screw them we're
going to invalidate them anyway. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-07-03 03:39:18 +0800

15 Jun, 2012

1 commit

606686eea Btrfs: use rcu to protect device->name ... Browse Code »

Al pointed out that we can just toss out the old name on a device and add a
new one arbitrarily, so anybody who uses device->name in printk could
possibly use free'd memory. Instead of adding locking around all of this he
suggested doing it with RCU, so I've introduced a struct rcu_string that
does just that and have gone through and protected all accesses to
device->name that aren't under the uuid_mutex with rcu_read_lock(). This
protects us and I will use it for dealing with removing the device that we
used to mount the file system in a later patch. Thanks,

Reviewed-by: David Sterba
Signed-off-by: Josef Bacik

Josef Bacik
2012-06-15 09:29:16 +0800

01 Jun, 2012

1 commit

1e20932a2 Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus ... Browse Code »

Conflicts:
fs/btrfs/ulist.h

Signed-off-by: Chris Mason

Chris Mason
2012-06-01 04:49:53 +0800

30 May, 2012

4 commits

442a4f630 Btrfs: add device counters for detected IO and checksum errors ... Browse Code »

The goal is to detect when drives start to get an increased error rate,
when drives should be replaced soon. Therefore statistic counters are
added that count IO errors (read, write and flush). Additionally, the
software detected errors like checksum errors and corrupted blocks are
counted.

Signed-off-by: Stefan Behrens

Stefan Behrens
2012-05-30 22:23:39 +0800
d1ac6e41d Btrfs: use fastpath in extent state ops as much as possible ... Browse Code »

Fully utilize our extent state's new helper functions to use
fastpath as much as possible.

Signed-off-by: Liu Bo
Reviewed-by: Josef Bacik

Liu Bo
2012-05-30 22:23:34 +0800
5fd020435 Btrfs: finish ordered extents in their own thread ... Browse Code »

We noticed that the ordered extent completion doesn't really rely on having
a page and that it could be done independantly of ending the writeback on a
page. This patch makes us not do the threaded endio stuff for normal
buffered writes and direct writes so we can end page writeback as soon as
possible (in irq context) and only start threads to do the ordered work when
it is actually done. Compression needs to be reworked some to take
advantage of this as well, but atm it has to do a find_get_page in its endio
handler so it must be done in its own thread. This makes direct writes
quite a bit faster. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-05-30 22:23:33 +0800
d7dbe9e7f Btrfs: fix compile warnings in extent_io.c ... Browse Code »

These warnings are bogus since we will always have at least one page in an
eb, but to make the compiler happy just set ret = 0 in these two cases.
Thanks,
Btrfs: fix compile warnings in extent_io.c

These warnings are bogus since we will always have at least one page in an
eb, but to make the compiler happy just set ret = 0 in these two cases.
Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-05-30 22:23:28 +0800

26 May, 2012

1 commit

815a51c74 Btrfs: dummy extent buffers for tree mod log ... Browse Code »

The tree modification log needs two ways to create dummy extent buffers,
once by allocating a fresh one (to rebuild an old root) and once by
cloning an existing one (to make private rewind modifications) to it.

Signed-off-by: Jan Schmidt

Jan Schmidt
2012-05-26 18:17:54 +0800

11 May, 2012

4 commits

fd5e62a37 Btrfs: remove the useless assignment to *entry in function tree_insert of file extent_io.c ... Browse Code »

In tree_insert, var *entry is used in the loop only, and is useless
out of the loop. Remove the useless assignment after the loop.

Signed-off-by: Wang Sheng-Hui

Wang Sheng-Hui
2012-05-11 22:56:40 +0800
477d7eafa Btrfs: fix the comment for find_first_extent_bit ... Browse Code »

The return value of find_first_extent_bit is 1 or 0, no < 0.
And if found something, return 0; if nothing was found, return 1.
Fix the comment.

Signed-off-by: Wang Sheng-Hui

Wang Sheng-Hui
2012-05-11 22:56:39 +0800
39bab87ba Btrfs: fix btrfs_release_extent_buffer_page with the right usage of num_extent_pages ... Browse Code »

num_extent_pages returns the number of pages in the specific range, not
the index of the last page in the eb range.

btrfs_release_extent_buffer_page is called with start_idx set 0 in current
codes, so it's not a problem yet. But the logic is indeed wrong.

Fix it here.

Signed-off-by: Wang Sheng-Hui

Wang Sheng-Hui
2012-05-11 22:56:38 +0800
1b303fc05 Btrfs: cleanup the comment for clear_state_bit in extent_io.c ... Browse Code »

No 'delete' arg is used for clear_state_bit.
Cleanup the comment.

Signed-off-by: Wang Sheng-Hui

Wang Sheng-Hui
2012-05-11 22:56:38 +0800

07 May, 2012

1 commit

271fd5d72 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes from Chris Mason:
"The big ones here are a memory leak we introduced in rc1, and a
scheduling while atomic if the transid on disk doesn't match the
transid we expected. This happens for corrupt blocks, or out of date
disks.

It also fixes up the ioctl definition for our ioctl to resolve logical
inode numbers. The __u32 was a merging error and doesn't match what
we ship in the progs."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: avoid sleeping in verify_parent_transid while atomic
Btrfs: fix crash in scrub repair code when device is missing
btrfs: Fix mismatching struct members in ioctl.h
Btrfs: fix page leak when allocing extent buffers
Btrfs: Add properly locking around add_root_to_dirty_list

Linus Torvalds
2012-05-07 01:20:07 +0800

05 May, 2012

1 commit

17de39ac1 Btrfs: fix page leak when allocing extent buffers ... Browse Code »

If we happen to alloc a extent buffer and then alloc a page and notice that
page is already attached to an extent buffer, we will only unlock it and
free our existing eb. Any pages currently attached to that eb will be
properly freed, but we don't do the page_cache_release() on the page where
we noticed the other extent buffer which can cause us to leak pages and I
hope cause the weird issues we've been seeing in this area. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2012-05-05 03:16:06 +0800

29 Apr, 2012

1 commit

f7b006931 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes from Chris Mason:
"This has our collection of bug fixes. I missed the last rc because I
thought our patches were making NFS crash during my xfs test runs.
Turns out it was an NFS client bug fixed by someone else while I tried
to bisect it.

All of these fixes are small, but some are fairly high impact. The
biggest are fixes for our mount -o remount handling, a deadlock due to
GFP_KERNEL allocations in readdir, and a RAID10 error handling bug.

This was tested against both 3.3 and Linus' master as of this morning."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (26 commits)
Btrfs: reduce lock contention during extent insertion
Btrfs: avoid deadlocks from GFP_KERNEL allocations during btrfs_real_readdir
Btrfs: Fix space checking during fs resize
Btrfs: fix block_rsv and space_info lock ordering
Btrfs: Prevent root_list corruption
Btrfs: fix repair code for RAID10
Btrfs: do not start delalloc inodes during sync
Btrfs: fix that check_int_data mount option was ignored
Btrfs: don't count CRC or header errors twice while scrubbing
Btrfs: fix btrfs_ioctl_dev_info() crash on missing device
btrfs: don't return EINTR
Btrfs: double unlock bug in error handling
Btrfs: always store the mirror we read the eb from
fs/btrfs/volumes.c: add missing free_fs_devices
btrfs: fix early abort in 'remount'
Btrfs: fix max chunk size check in chunk allocator
Btrfs: add missing read locks in backref.c
Btrfs: don't call free_extent_buffer twice in iterate_irefs
Btrfs: Make free_ipath() deal gracefully with NULL pointers
Btrfs: avoid possible use-after-free in clear_extent_bit()
...

Linus Torvalds
2012-04-29 00:30:07 +0800

19 Apr, 2012

3 commits

5cf1ab561 Btrfs: always store the mirror we read the eb from ... Browse Code »

A user reported a panic where we were trying to fix a bad mirror but the
mirror number we were giving was 0, which is invalid. This is because we
don't do the transid verification until after the read, so as far as the
read code is concerned the read was a success. So instead store the mirror
we read from so that if there is some failure post read we know which mirror
to try next and which mirror needs to be fixed if we find a good copy of the
block. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-04-19 01:22:30 +0800
cdc6a3952 Btrfs: avoid possible use-after-free in clear_extent_bit() ... Browse Code »

clear_extent_bit()
{
next_node = rb_next(&state->rb_node);
...
clear_state_bit(state);

Li Zefan
2012-04-19 01:22:18 +0800
8e52acf70 Btrfs: retrurn void from clear_state_bit ... Browse Code »

Currently it returns a set of bits that were cleared, but this return
value is not used at all.

Moreover it doesn't seem to be useful, because we may clear the bits
of a few extent_states, but only the cleared bits of last one is
returned.

Signed-off-by: Li Zefan

Li Zefan
2012-04-19 01:22:16 +0800

14 Apr, 2012

1 commit

659e45d8a Merge branch 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull the minimal btrfs branch from Chris Mason:
"We have a use-after-free in there, along with errors when mount -o
discard is enabled, and a BUG_ON(we should compile with UP more
often)."

* 'for-linus-min' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: use commit root when loading free space cache
Btrfs: fix use-after-free in __btrfs_end_transaction
Btrfs: check return value of bio_alloc() properly
Btrfs: remove lock assert from get_restripe_target()
Btrfs: fix eof while discarding extents
Btrfs: fix uninit variable in repair_eb_io_failure
Revert "Btrfs: increase the global block reserve estimates"

Linus Torvalds
2012-04-14 10:41:27 +0800

13 Apr, 2012

2 commits

e627ee7bc Btrfs: check return value of bio_alloc() properly ... Browse Code »

bio_alloc() has the possibility of returning NULL.
So, it is necessary to check the return value.

Signed-off-by: Tsutomu Itoh
Signed-off-by: Chris Mason

Tsutomu Itoh
2012-04-13 04:03:56 +0800
d95603b26 Btrfs: fix uninit variable in repair_eb_io_failure ... Browse Code »

We'd have to be passing bogus extent buffers for this uninit variable to
actually be used, but set it to zero just in case.

Signed-off-by: Chris Mason

Chris Mason
2012-04-13 03:55:15 +0800

31 Mar, 2012

1 commit

9613bebb2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes and features from Chris Mason:
"We've merged in the error handling patches from SuSE. These are
already shipping in the sles kernel, and they give btrfs the ability
to abort transactions and go readonly on errors. It involves a lot of
churn as they clarify BUG_ONs, and remove the ones we now properly
deal with.

Josef reworked the way our metadata interacts with the page cache.
page->private now points to the btrfs extent_buffer object, which
makes everything faster. He changed it so we write an whole extent
buffer at a time instead of allowing individual pages to go down,,
which will be important for the raid5/6 code (for the 3.5 merge
window ;)

Josef also made us more aggressive about dropping pages for metadata
blocks that were freed due to COW. Overall, our metadata caching is
much faster now.

We've integrated my patch for metadata bigger than the page size.
This allows metadata blocks up to 64KB in size. In practice 16K and
32K seem to work best. For workloads with lots of metadata, this cuts
down the size of the extent allocation tree dramatically and fragments
much less.

Scrub was updated to support the larger block sizes, which ended up
being a fairly large change (thanks Stefan Behrens).

We also have an assortment of fixes and updates, especially to the
balancing code (Ilya Dryomov), the back ref walker (Jan Schmidt) and
the defragging code (Liu Bo)."

Fixed up trivial conflicts in fs/btrfs/scrub.c that were just due to
removal of the second argument to k[un]map_atomic() in commit
7ac687d9e047.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (75 commits)
Btrfs: update the checks for mixed block groups with big metadata blocks
Btrfs: update to the right index of defragment
Btrfs: do not bother to defrag an extent if it is a big real extent
Btrfs: add a check to decide if we should defrag the range
Btrfs: fix recursive defragment with autodefrag option
Btrfs: fix the mismatch of page->mapping
Btrfs: fix race between direct io and autodefrag
Btrfs: fix deadlock during allocating chunks
Btrfs: show useful info in space reservation tracepoint
Btrfs: don't use crc items bigger than 4KB
Btrfs: flush out and clean up any block device pages during mount
btrfs: disallow unequal data/metadata blocksize for mixed block groups
Btrfs: enhance superblock sanity checks
Btrfs: change scrub to support big blocks
Btrfs: minor cleanup in scrub
Btrfs: introduce common define for max number of mirrors
Btrfs: fix infinite loop in btrfs_shrink_device()
Btrfs: fix memory leak in resolver code
Btrfs: allow dup for data chunks in mixed mode
Btrfs: validate target profiles only if we are going to use them
...

Linus Torvalds
2012-03-31 03:44:29 +0800

29 Mar, 2012

1 commit

1d4284bd6 Merge branch 'error-handling' into for-linus ... Browse Code »

Conflicts:
fs/btrfs/ctree.c
fs/btrfs/disk-io.c
fs/btrfs/extent-tree.c
fs/btrfs/extent_io.c
fs/btrfs/extent_io.h
fs/btrfs/inode.c
fs/btrfs/scrub.c

Signed-off-by: Chris Mason

Chris Mason
2012-03-29 08:31:37 +0800

27 Mar, 2012

8 commits

ea4667940 Btrfs: deal with read errors on extent buffers differently ... Browse Code »

Since we need to read and write extent buffers in their entirety we can't use
the normal bio_readpage_error stuff since it only works on a per page basis. So
instead make it so that if we see an io error in endio we just mark the eb as
having an IO error and then in btree_read_extent_buffer_pages we will manually
try other mirrors and then overwrite the bad mirror if we find a good copy.
This works with larger than page size blocks. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2012-03-27 09:57:36 +0800
a098d8e8e Btrfs: loop waiting on writeback ... Browse Code »

lock_extent_buffer_for_io needs to loop around and make sure the
writeback bits are not set.

Signed-off-by: Chris Mason

Chris Mason
2012-03-27 05:04:23 +0800
0b32f4bbb Btrfs: ensure an entire eb is written at once ... Browse Code »

This patch simplifies how we track our extent buffers. Previously we could exit
writepages with only having written half of an extent buffer, which meant we had
to track the state of the pages and the state of the extent buffers differently.
Now we only read in entire extent buffers and write out entire extent buffers,
this allows us to simply set bits in our bflags to indicate the state of the eb
and we no longer have to do things like track uptodate with our iotree. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2012-03-27 05:04:23 +0800
5df4235ea Btrfs: introduce mark_extent_buffer_accessed ... Browse Code »

Because an eb can have multiple pages we need to make sure that all pages within
the eb are markes as accessed, since releasepage can be called against any page
in the eb. This will keep us from possibly evicting hot eb's when we're doing
larger than pagesize eb's. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-03-27 04:51:09 +0800
3083ee2e1 Btrfs: introduce free_extent_buffer_stale ... Browse Code »

Because btrfs cow's we can end up with extent buffers that are no longer
necessary just sitting around in memory. So instead of evicting these pages, we
could end up evicting things we actually care about. Thus we have
free_extent_buffer_stale for use when we are freeing tree blocks. This will
make it so that the ref for the eb being in the radix tree is dropped as soon as
possible and then is freed when the refcount hits 0 instead of waiting to be
released by releasepage. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-03-27 04:51:08 +0800
115391d23 Btrfs: only use the existing eb if it's count isn't 0 ... Browse Code »

We can run into a problem where we find an eb for our existing page already on
the radix tree but it has a ref count of 0. It hasn't yet been removed by RCU
yet so this can cause issues where we will use the EB after free. So do
atomic_inc_not_zero on the exists->refs and if it is zero just do
synchronize_rcu() and try again. We won't have to worry about new allocators
coming in since they will block on the page lock at this point. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-03-27 04:51:08 +0800
4f2de97ac Btrfs: set page->private to the eb ... Browse Code »

We spend a lot of time looking up extent buffers from pages when we could just
store the pointer to the eb the page is associated with in page->private. This
patch does just that, and it makes things a little simpler and reduces a bit of
CPU overhead involved with doing metadata IO. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-03-27 04:51:07 +0800
727011e07 Btrfs: allow metadata blocks larger than the page size ... Browse Code »

A few years ago the btrfs code to support blocks lager than
the page size was disabled to fix a few corner cases in the
page cache handling. This fixes the code to properly support
large metadata blocks again.

Since current kernels will crash early and often with larger
metadata blocks, this adds an incompat bit so that older kernels
can't mount it.

This also does away with different blocksizes for nodes and leaves.
You get a single block size for all tree blocks.

Signed-off-by: Chris Mason

Chris Mason
2012-03-27 04:50:37 +0800

22 Mar, 2012

1 commit

79787eaab btrfs: replace many BUG_ONs with proper error handling ... Browse Code »

btrfs currently handles most errors with BUG_ON. This patch is a work-in-
progress but aims to handle most errors other than internal logic
errors and ENOMEM more gracefully.

This iteration prevents most crashes but can run into lockups with
the page lock on occasion when the timing "works out."

Signed-off-by: Jeff Mahoney

Jeff Mahoney
2012-03-22 18:52:54 +0800