Eric Lee / smarc-fsl-linux-kernel

04 Jan, 2012

2 commits

2a79f17e4 vfs: mnt_drop_write_file()

new helper (wrapper around mnt_drop_write()) to be used in pair with
mnt_want_write_file().

Signed-off-by: Al Viro
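
The pairing described in this commit message can be sketched in user space as follows. This is only a simplified model: `struct vfsmount` and `struct file` here are tiny stand-ins (an assumption, not the kernel layouts), the `writers` counter is hypothetical, and the real `mnt_want_write_file()` has a fast path for files already opened for write that is not modelled here.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel structures; NOT the real layouts. */
struct vfsmount { int writers; };
struct file { struct vfsmount *mnt; };

/* Model of taking write access to a mount. */
static int mnt_want_write(struct vfsmount *mnt)
{
    mnt->writers++;
    return 0;
}

/* Model of releasing write access; must pair with mnt_want_write(). */
static void mnt_drop_write(struct vfsmount *mnt)
{
    assert(mnt->writers > 0);
    mnt->writers--;
}

/* File-based variants: callers only need the struct file. */
static int mnt_want_write_file(struct file *file)
{
    return mnt_want_write(file->mnt);
}

static void mnt_drop_write_file(struct file *file)
{
    mnt_drop_write(file->mnt);
}
```

With the wrapper, a caller that took write access through the file can release it through the file as well, without reaching into the mount itself.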

Al Viro
2012-01-04 11:52:40 +0800
a561be710 switch a bunch of places to mnt_want_write_file()

it's both faster (in case when file has been opened for write) and cleaner.

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:52:35 +0800

16 Dec, 2011

2 commits

567a45e91 Merge branch 'for-chris' of http://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work into integration

Conflicts:
fs/btrfs/inode.c

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Chris Mason
2011-12-16 02:43:49 +0800
660d3f6cd Btrfs: fix how we do delalloc reservations and how we free reservations on error

Running xfstests 269 with some tracing my scripts kept spitting out errors about
releasing bytes that we didn't actually have reserved. This took me down a huge
rabbit hole and it turns out the way we deal with reserved_extents is wrong:
we need to set it only if the reservation succeeds, otherwise the free()
method will come in and unreserve space that isn't actually reserved yet, which
can lead to other warnings and such. The math was all working out right in the
end, but it caused all sorts of other issues in addition to making my scripts
yell and scream and generally make it impossible for me to track down the
original issue I was looking for. The other problem is with our error handling
in the reservation code. There are two cases that we need to deal with:

1) We raced with free. In this case free won't free anything because csum_bytes
is modified before we drop the lock in our reservation path, so free rightly
doesn't release any space because the reservation code may be depending on that
reservation. However if we fail, we need the reservation side to do the free at
that point since that space is no longer in use. So as it stands the code was
doing this fine and it worked out, except in case #2.

2) We don't race with free. Nobody comes in and changes anything, and our
reservation fails. In this case we didn't reserve anything anyway and we just
need to clean up csum_bytes but not free anything. So we keep track of
csum_bytes before we drop the lock and if it hasn't changed we know we can just
decrement csum_bytes and carry on.

Because we can race with free(), since we have to drop our spin_lock to do
the reservation, I'm going to serialize all reservations with
the i_mutex. We already get this for free in the heavy use paths, truncate and
file write all hold the i_mutex, just needed to add it to page_mkwrite and
various ioctl/balance things. With this patch my space leak scripts no longer
scream bloody murder. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2011-12-16 00:04:22 +0800

15 Dec, 2011

1 commit

306424cc8 Btrfs: fix ctime update of on-disk inode

To reproduce the bug:

# touch /mnt/tmp
# stat /mnt/tmp | grep Change
Change: 2011-12-09 09:32:23.412105981 +0800
# chattr +i /mnt/tmp
# stat /mnt/tmp | grep Change
Change: 2011-12-09 09:32:43.198105295 +0800
# umount /mnt
# mount /dev/loop1 /mnt
# stat /mnt/tmp | grep Change
Change: 2011-12-09 09:32:23.412105981 +0800

We should update ctime of in-memory inode before calling
btrfs_update_inode().

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-12-15 23:50:37 +0800

01 Dec, 2011

1 commit

ece7d20e8 Btrfs: Don't error on resizing FS to same size

It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.

Signed-off-by: Mike Fleetwood
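
The behaviour this commit describes can be sketched as a three-way decision. The enum and function names below are hypothetical and only illustrate the idea of treating an equal size as a successful no-op; they are not taken from the btrfs resize code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of comparing the requested size with the current one. */
enum resize_action { RESIZE_NONE, RESIZE_SHRINK, RESIZE_GROW };

static enum resize_action pick_resize_action(uint64_t old_size, uint64_t new_size)
{
    if (new_size > old_size)
        return RESIZE_GROW;
    if (new_size < old_size)
        return RESIZE_SHRINK;
    /* Same size: report success without shrinking or growing. */
    return RESIZE_NONE;
}
```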

Mike Fleetwood
2011-12-01 01:46:04 +0800

20 Nov, 2011

2 commits

5bb146823 Btrfs: prefix resize related printks with btrfs:

For the user it is confusing to find something like:
[10197.627710] new size for /dev/mapper/vg0-usr_share is 3221225472
in kernel log, because it doesn't point directly to btrfs.

This patch prefixes those messages with "btrfs:" like other btrfs
related printks.

Signed-off-by: Arnd Hannemann
Signed-off-by: Chris Mason

Arnd Hannemann
2011-11-20 20:42:16 +0800
745c4d8e1 btrfs: Fix up 32/64-bit compatibility for new ioctls

This patch casts to unsigned long before casting to a pointer and fixes
the following warnings:
fs/btrfs/extent_io.c:2289:20: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2933:37: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
fs/btrfs/ioctl.c:2937:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/ioctl.c:3020:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/scrub.c:275:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
fs/btrfs/backref.c:686:27: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

Signed-off-by: Jeff Mahoney
Signed-off-by: Chris Mason
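
The casting pattern named in this commit message (going through `unsigned long` between a 64-bit integer and a pointer) might look like the sketch below. The helper names are hypothetical. It assumes a Linux-style ABI where `unsigned long` has the same width as a pointer; portable user-space code would normally use `uintptr_t` instead.

```c
#include <assert.h>
#include <stdint.h>

/* Store a pointer in a fixed-width 64-bit field, e.g. an ioctl argument. */
static uint64_t ptr_to_u64(const void *p)
{
    /* Assumption: unsigned long is pointer-sized (true for ILP32 and LP64). */
    assert(sizeof(unsigned long) == sizeof(void *));
    /* Convert at pointer width first, then widen to 64 bits. */
    return (uint64_t)(unsigned long)p;
}

/* Recover the pointer from the 64-bit field. */
static void *u64_to_ptr(uint64_t v)
{
    /* Narrow to pointer width explicitly before casting to a pointer. */
    return (void *)(unsigned long)v;
}
```

On a 32-bit build a direct cast between `uint64_t` and a pointer changes size in one step, which is what the quoted warnings complain about; the intermediate cast makes the narrowing explicit.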

Jeff Mahoney
2011-11-20 20:42:13 +0800

06 Nov, 2011

3 commits

740c3d226 Btrfs: fix the new inspection ioctls for 32 bit compat

The new ioctls to follow backrefs are not clean for 32/64 bit
compat. This reworks them for u64s everywhere. They are brand new, so
there are no problems with changing the interface now.

Signed-off-by: Chris Mason

Chris Mason
2011-11-06 16:08:49 +0800
806468f8b Merge git://git.jan-o-sch.net/btrfs-unstable into integration

Conflicts:
fs/btrfs/Makefile
fs/btrfs/extent_io.c
fs/btrfs/extent_io.h
fs/btrfs/scrub.c

Signed-off-by: Chris Mason

Chris Mason
2011-11-06 16:07:10 +0800
6c41761fc btrfs: separate superblock items out of fs_info

fs_info has now ~9kb, more than fits into one page. This will cause
mount failure when memory is too fragmented. Top space consumers are
super block structures super_copy and super_for_commit, ~2.8kb each.
Allocate them dynamically. fs_info will be ~3.5kb. (measured on x86_64)

Add a wrapper for freeing fs_info and all of its dynamically allocated
members.

Signed-off-by: David Sterba

David Sterba
2011-11-06 16:04:01 +0800

24 Oct, 2011

2 commits

a81d3b1ba Merge branch 'hotfixes-20111024/josef/for-chris' into btrfs-next-stable

David Sterba
2011-10-24 20:47:58 +0800
afd582ac8 Merge remote-tracking branch 'remotes/josef/for-chris' into btrfs-next-stable

David Sterba
2011-10-24 20:47:57 +0800

21 Oct, 2011

6 commits

f4c697e64 btrfs: return EINVAL if start > total_bytes in fitrim ioctl

We should return EINVAL if the start is beyond the end of the file
system in the btrfs_ioctl_fitrim(). Fix that by adding the appropriate
check for it.

Also in the btrfs_trim_fs() it is possible that len+start might overflow
if big values are passed. Fix it by decrementing the len so that start+len
is equal to the file system size in the worst case.

Signed-off-by: Lukas Czerner
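
Both checks from this commit message (reject a start past the end, and shrink len so that start+len cannot overflow) can be sketched together. The function name and signature are hypothetical and this is not the code from btrfs_ioctl_fitrim() or btrfs_trim_fs(); it only shows an overflow-safe way to express the rule.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Validate a trim request against the file system size.
 * Returns 0 and possibly shrinks *len, or -EINVAL if start is out of range.
 */
static int clamp_trim_range(uint64_t start, uint64_t *len, uint64_t total_bytes)
{
    if (start > total_bytes)
        return -EINVAL;

    /*
     * Compare against the remaining space instead of computing start + *len,
     * which could wrap around. total_bytes - start cannot underflow here.
     */
    if (*len > total_bytes - start)
        *len = total_bytes - start;

    return 0;
}
```

After a successful call, start + *len is at most total_bytes, so later arithmetic on the range stays within 64 bits.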

Lukas Czerner
2011-10-21 00:10:40 +0800
008873eaf Btrfs: honor extent thresh during defragmentation

We won't defrag an extent, if it's bigger than the threshold we
specified and there's no small extent before it, but actually
the code doesn't work this way.

There are three bugs:

- When should_defrag_range() decides we should keep on defragmenting
an extent, last_len is not incremented. (old bug)

- The length that passes to should_defrag_range() is not the length
we're going to defrag. (new bug)

- We always defrag 256K bytes data, and a big extent can be part of
this range. (new bug)

For a file with 4 extents:

| 4K | 4K | 256K | 256K |

The result of defrag with (the default) 256K extent thresh should be:

| 264K | 256K |

but with those bugs, we'll get:

| 520K |

Signed-off-by: Li Zefan
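
The expected result in the four-extent example above can be reproduced with a toy model of the stated rule: an extent is merged when it is below the threshold, or when the extent right before it was below the threshold. This is only one reading of the commit message, not the logic of should_defrag_range(); the function name is hypothetical, and treating an extent equal to the threshold as "big" is an assumption made to match the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model: merge neighbouring extents into runs.
 * Writes the resulting extent sizes to out (capacity >= n), returns the count.
 */
static size_t merge_extents(const uint64_t *ext, size_t n, uint64_t thresh,
                            uint64_t *out)
{
    size_t count = 0;
    uint64_t run = 0;     /* size of the run currently being merged */
    int prev_small = 0;   /* was the previous extent below the threshold? */

    for (size_t i = 0; i < n; i++) {
        int small = ext[i] < thresh;

        if (small || prev_small) {
            run += ext[i];                /* extent joins the current run */
        } else {
            if (run) {                    /* close any open run first */
                out[count++] = run;
                run = 0;
            }
            out[count++] = ext[i];        /* big extent is left alone */
        }
        prev_small = small;
    }
    if (run)
        out[count++] = run;
    return count;
}
```

For the extents 4K, 4K, 256K, 256K with a 256K threshold, the model yields 264K and 256K, matching the expected layout rather than the single 520K extent produced by the bugs.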

Li Zefan
2011-10-21 00:10:39 +0800
5ca496604 Btrfs: fix wrong max_to_defrag in btrfs_defrag_file()

It's off-by-one, and thus we may skip the last page while defragmenting.

An example case:

# create /mnt/file with 2 4K file extents
# btrfs fi defrag /mnt/file
# sync
# filefrag /mnt/file
/mnt/file: 2 extents found

So it's not defragmented.

Signed-off-by: Li Zefan
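
The kind of off-by-one described here is easy to see in the page arithmetic. The sketch below is a generic illustration, not the expression used in btrfs_defrag_file(); `DEMO_PAGE_SHIFT` and the function name are hypothetical, and 4K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4K pages */

/* Number of pages needed to cover a file of the given size in bytes. */
static uint64_t pages_to_defrag(uint64_t size)
{
    if (size == 0)
        return 0;

    /* Index of the last page that holds data. */
    uint64_t last_index = (size - 1) >> DEMO_PAGE_SHIFT;

    /* Page indices start at 0, so the count is last_index + 1. */
    return last_index + 1;
}
```

For the 8K file in the example, the last index is 1 but two pages must be processed; using the index as the count would skip the final page.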

Li Zefan
2011-10-21 00:10:37 +0800
151a31b25 Btrfs: use i_size_read() in btrfs_defrag_file()

Don't use inode->i_size directly, since we're not holding i_mutex.

This also fixes another bug, that i_size can change after it's checked
against 0 and then (i_size - 1) can be negative.

Signed-off-by: Li Zefan
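
The second bug mentioned in this commit message comes from reading the size twice. A user-space sketch of the safer pattern, reading once into a local and using only that snapshot, is shown below. `struct demo_inode` and `demo_i_size_read()` are simplified stand-ins (assumptions), not the kernel's struct inode or i_size_read().

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4K pages */

/* Stand-in for an inode whose size may change concurrently. */
struct demo_inode { volatile int64_t i_size; };

/* Stand-in for i_size_read(): a single read of the size. */
static int64_t demo_i_size_read(const struct demo_inode *inode)
{
    return inode->i_size;
}

/*
 * Returns 1 and stores the last page index if the file has data,
 * or 0 if it is empty. The size is read exactly once.
 */
static int last_page_index(const struct demo_inode *inode, uint64_t *last_index)
{
    int64_t isize = demo_i_size_read(inode);

    if (isize <= 0)
        return 0;

    /* isize is a local snapshot, so isize - 1 cannot go negative here. */
    *last_index = (uint64_t)(isize - 1) >> DEMO_PAGE_SHIFT;
    return 1;
}
```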

Li Zefan
2011-10-21 00:10:35 +0800
cbcc83265 Btrfs: fix defragmentation regression

There's an off-by-one bug:

# create a file with lots of 4K file extents
# btrfs fi defrag /mnt/file
# sync
# filefrag -v /mnt/file
Filesystem type is: 9123683e
File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
ext logical physical expected length flags
0 0 3372 64
1 64 3136 3435 1
2 65 3436 3136 64
3 129 3201 3499 1
4 130 3500 3201 64
5 194 3266 3563 1
6 195 3564 3266 64
7 259 3331 3627 1
8 260 3628 3331 40 eof

After this patch:

...
# filefrag -v /mnt/file
Filesystem type is: 9123683e
File size of /mnt/file is 1228800 (300 blocks, blocksize 4096)
ext logical physical expected length flags
0 0 3372 300 eof
/mnt/file: 1 extent found

Signed-off-by: Li Zefan

Li Zefan
2011-10-21 00:10:34 +0800
60ccf82f5 btrfs: fix memory leak in btrfs_defrag_file

kmemleak found this:
unreferenced object 0xffff8801b64af968 (size 512):
comm "btrfs-cleaner", pid 3317, jiffies 4306810886 (age 903.272s)
hex dump (first 32 bytes):
00 82 01 07 00 ea ff ff c0 83 01 07 00 ea ff ff ................
80 82 01 07 00 ea ff ff c0 87 01 07 00 ea ff ff ................
backtrace:
[] kmemleak_alloc+0x5c/0xc0
[] kmem_cache_alloc_trace+0x163/0x240
[] btrfs_defrag_file+0xf0/0xb20
[] btrfs_run_defrag_inodes+0x165/0x210
[] cleaner_kthread+0x177/0x190
[] kthread+0x8d/0xa0
[] kernel_thread_helper+0x4/0x10
[] 0xffffffffffffffff

"pages" is not always freed. Fix it by removing the unnecessary additional return.

Signed-off-by: Diego Calleja

Diego Calleja
2011-10-21 00:10:33 +0800

20 Oct, 2011

2 commits

e27425d61 Btrfs: only inherit btrfs specific flags when creating files

Xfstests 79 was failing because we were inheriting the S_APPEND flag when we
weren't supposed to. There isn't any specific documentation on this so I'm
taking the test as the standard of how things work, and having S_APPEND set on a
directory doesn't mean that S_APPEND gets inherited by its children according to
this test. So only inherit btrfs specific things. This will let us set
compress/nocompress on specific directories and everything in the directories
will inherit this flag, same with nodatacow. With this patch test 79 passes.
Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2011-10-20 03:12:50 +0800
3b16a4e3c Btrfs: use the inode's mapping mask for allocating pages

Johannes pointed out we were allocating only kernel pages for doing writes,
which is kind of a big deal if you are on 32bit and have more than a gig of ram.
So fix our allocations to use the mapping's gfp but still clear __GFP_FS so we
don't re-enter. Thanks,

Reported-by: Johannes Weiner
Signed-off-by: Josef Bacik
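
The mask manipulation described in this commit message (start from the mapping's allocation mask, then clear the "may re-enter the file system" bit) can be sketched as plain bit arithmetic. The type and the `DEMO_GFP_*` values below are hypothetical placeholders, not the kernel's gfp_t flags or their real bit positions.

```c
#include <assert.h>

/* Hypothetical stand-ins for allocation flag bits; values are made up. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_HIGHMEM (1u << 1)   /* allocation may use high memory */
#define DEMO_GFP_IO      (1u << 6)   /* allocation may start I/O */
#define DEMO_GFP_FS      (1u << 7)   /* allocation may call into the FS */

/* Keep everything the mapping allows, except re-entering the file system. */
static demo_gfp_t write_page_mask(demo_gfp_t mapping_mask)
{
    return mapping_mask & ~DEMO_GFP_FS;
}
```

Starting from the mapping's own mask keeps the high-memory permission that a fixed kernel-only mask would drop, which is the 32-bit concern raised above.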

Josef Bacik
2011-10-20 03:12:45 +0800

13 Oct, 2011

1 commit

b2f9452bd Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux

* 'btrfs-3.0' of git://github.com/chrismason/linux:
Btrfs: make sure not to defrag extents past i_size
Btrfs: fix recursive auto-defrag

Linus Torvalds
2011-10-13 14:20:40 +0800

11 Oct, 2011

2 commits

f7f43cc84 Btrfs: make sure not to defrag extents past i_size

The btrfs file defrag code will loop through the extents and
force COW on them. But if there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.

The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented. defrag will end up hitting the same extents again and
again.

In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.

The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.

Signed-off-by: Chris Mason

Chris Mason
2011-10-11 23:45:55 +0800
2a0f7f576 Btrfs: fix recursive auto-defrag

Follow those steps:

# mount -o autodefrag /dev/sda7 /mnt
# dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
# sync
# dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc

and then it'll go into a loop: writeback -> defrag -> writeback ...

It's because writeback writes [8K, 200K] and then writes [0, 8K].

I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-10-11 03:43:34 +0800

29 Sep, 2011

1 commit

d7728c960 btrfs: new ioctls to do logical->inode and inode->path resolving

these ioctls make use of the new functions initially added for scrub. they
return all inodes belonging to a logical address (BTRFS_IOC_LOGICAL_INO) and
all paths belonging to an inode (BTRFS_IOC_INO_PATHS).

Signed-off-by: Jan Schmidt

Jan Schmidt
2011-09-29 18:54:28 +0800

21 Sep, 2011

2 commits

0a7a0519d Merge branch 'btrfs-3.0' into for-linus

Chris Mason
2011-09-21 02:49:29 +0800
b6f3409b2 Btrfs: reserve sufficient space for ioctl clone

Fix a crash/BUG_ON in the clone ioctl due to insufficient reservation. We
need to reserve space for:

- adjusting the old extent (possibly splitting it)
- adding the new extent
- updating the inode

Signed-off-by: Sage Weil
Signed-off-by: Chris Mason

Sage Weil
2011-09-21 02:48:51 +0800

18 Sep, 2011

4 commits

2cf4ce7c2 Merge branch 'btrfs-3.0' into for-linus

Chris Mason
2011-09-18 22:31:44 +0800
dde820fbf Btrfs: don't change inode flag of the dest clone file

The dst file will have the same inode flags as the src file after
file clone, and I think it's unexpected.

For example, the dst file will suddenly become immutable after
getting some share of data with src file, if the src is immutable.

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-09-18 22:20:46 +0800
0e7b824c4 Btrfs: don't make a file partly checksummed through file clone

To reproduce the bug:

# mount /dev/sda7 /mnt
# dd if=/dev/zero of=/mnt/src bs=4K count=1
# umount /mnt

# mount -o nodatasum /dev/sda7 /mnt
# dd if=/dev/zero of=/mnt/dst bs=4K count=1
# clone_range -s 4K -l 4K /mnt/src /mnt/dst

# echo 3 > /proc/sys/vm/drop_caches
# cat /mnt/dst
# dmesg
...
btrfs no csum found for inode 258 start 0
btrfs csum failed ino 258 off 0 csum 2566472073 private 0

It's because part of the file is checksummed and the other part is not,
and then btrfs will complain checksum is not found when we read the file.

Disallow file clone if src and dst file have different checksum flag,
so we ensure a file is completely checksummed or unchecksummed.

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason
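
The rule stated in this commit message (refuse the clone when source and destination differ in their checksum flag) reduces to comparing one bit. The flag name, its value, and the function below are hypothetical and only illustrate the check; the error code is also an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit standing in for the inode's "no data checksum" flag. */
#define DEMO_INODE_NODATASUM (1u << 0)

/* Returns 0 if cloning is allowed, -EINVAL if the checksum flags differ. */
static int check_clone_csum_flags(uint32_t src_flags, uint32_t dst_flags)
{
    if ((src_flags & DEMO_INODE_NODATASUM) !=
        (dst_flags & DEMO_INODE_NODATASUM))
        return -EINVAL;
    return 0;
}
```

Only the checksum bit is compared, so files that differ in other flags can still be cloned.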

Li Zefan
2011-09-18 22:20:46 +0800
71ef07861 Btrfs: fix pages truncation in btrfs_ioctl_clone()

It's a bug in commit f81c9cdc567cd3160ff9e64868d9a1a7ee226480
(Btrfs: truncate pages from clone ioctl target range)

We should pass the dest range to the truncate function, but not the
src range.

Also move the function before locking extent state.

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-09-18 22:20:46 +0800

13 Sep, 2011

1 commit

0b001b2ed Merge branch 'for-linus' of git://github.com/chrismason/linux

* 'for-linus' of git://github.com/chrismason/linux:
Btrfs: add dummy extent if dst offset excceeds file end in
Btrfs: calc file extent num_bytes correctly in file clone
btrfs: xattr: fix attribute removal
Btrfs: fix wrong nbytes information of the inode
Btrfs: fix the file extent gap when doing direct IO
Btrfs: fix unclosed transaction handle in btrfs_cont_expand
Btrfs: fix misuse of trans block rsv
Btrfs: reset to appropriate block rsv after orphan operations
Btrfs: skip locking if searching the commit root in csum lookup
btrfs: fix warning in iput for bad-inode
Btrfs: fix an oops when deleting snapshots

Linus Torvalds
2011-09-13 02:47:49 +0800

11 Sep, 2011

2 commits

d525e8ab0 Btrfs: add dummy extent if dst offset excceeds file end in

You can see there's no file extent with range [0, 4096]. Check this by
btrfsck:

# btrfsck /dev/sda7
root 5 inode 258 errors 100
...

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-09-11 22:52:25 +0800
d72c0842f Btrfs: calc file extent num_bytes correctly in file clone

num_bytes should be 4096 not 12288.

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason

Li Zefan
2011-09-11 22:52:25 +0800

18 Aug, 2011

1 commit

81d86e1b7 Merge branch 'btrfs-3.0' into for-linus

Chris Mason
2011-08-18 22:38:03 +0800

17 Aug, 2011

1 commit

f81c9cdc5 Btrfs: truncate pages from clone ioctl target range

We need to truncate page cache pages for the clone ioctl target range or
else we'll confuse ourselves to no end. If the old data was cached, we
used to still see it (until remount). If the page was partially updated
we used to get a mix of old and new data.

Signed-off-by: Sage Weil
Signed-off-by: Chris Mason

Sage Weil
2011-08-17 09:09:31 +0800

03 Aug, 2011

1 commit

ed8f37370 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (31 commits)
Btrfs: don't call writepages from within write_full_page
Btrfs: Remove unused variable 'last_index' in file.c
Btrfs: clean up for find_first_extent_bit()
Btrfs: clean up for wait_extent_bit()
Btrfs: clean up for insert_state()
Btrfs: remove unused members from struct extent_state
Btrfs: clean up code for merging extent maps
Btrfs: clean up code for extent_map lookup
Btrfs: clean up search_extent_mapping()
Btrfs: remove redundant code for dir item lookup
Btrfs: make acl functions really no-op if acl is not enabled
Btrfs: remove remaining ref-cache code
Btrfs: remove a BUG_ON() in btrfs_commit_transaction()
Btrfs: use wait_event()
Btrfs: check the nodatasum flag when writing compressed files
Btrfs: copy string correctly in INO_LOOKUP ioctl
Btrfs: don't print the leaf if we had an error
btrfs: make btrfs_set_root_node void
Btrfs: fix oops while writing data to SSD partitions
Btrfs: Protect the readonly flag of block group
...

Fix up trivial conflicts (due to acl and writeback cleanups) in
- fs/btrfs/acl.c
- fs/btrfs/ctree.h
- fs/btrfs/extent_io.c

Linus Torvalds
2011-08-03 15:14:05 +0800

02 Aug, 2011

1 commit

77906a507 Btrfs: copy string correctly in INO_LOOKUP ioctl

Memory areas [ptr, ptr+total_len] and [name, name+total_len]
may overlap, so it's wrong to use memcpy().

Signed-off-by: Li Zefan
Signed-off-by: Chris Mason
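
The overlap problem from this commit message can be shown with a small buffer shift. The function below is a generic illustration with a hypothetical name, not the INO_LOOKUP code. When source and destination ranges overlap, memcpy() has undefined behaviour, while memmove() is specified to handle it.

```c
#include <assert.h>
#include <string.h>

/*
 * Shift the first len bytes of buf right by shift bytes.
 * The caller must ensure buf holds at least len + shift bytes.
 * Source [buf, buf+len) and destination [buf+shift, buf+shift+len)
 * overlap whenever shift < len, so memmove() is required.
 */
static void shift_right(char *buf, size_t len, size_t shift)
{
    memmove(buf + shift, buf, len);
}
```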

Li Zefan
2011-08-02 02:30:45 +0800

28 Jul, 2011

2 commits

22712200e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: make sure reserve_metadata_bytes doesn't leak out strange errors
Btrfs: use the commit_root for reading free_space_inode crcs
Btrfs: reduce extent_state lock contention for metadata
Btrfs: remove lockdep magic from btrfs_next_leaf
Btrfs: make a lockdep class for each root
Btrfs: switch the btrfs tree locks to reader/writer
Btrfs: fix deadlock when throttling transactions
Btrfs: stop using highmem for extent_buffers
Btrfs: fix BUG_ON() caused by ENOSPC when relocating space
Btrfs: tag pages for writeback in sync
Btrfs: fix enospc problems with delalloc
Btrfs: don't flush delalloc arbitrarily
Btrfs: use find_or_create_page instead of grab_cache_page
Btrfs: use a worker thread to do caching
Btrfs: fix how we merge extent states and deal with cached states
Btrfs: use the normal checksumming infrastructure for free space cache
Btrfs: serialize flushers in reserve_metadata_bytes
Btrfs: do transaction space reservation before joining the transaction
Btrfs: try to only do one btrfs_search_slot in do_setxattr

Linus Torvalds
2011-07-28 07:43:52 +0800
9e0baf60d Btrfs: fix enospc problems with delalloc

So I had this brilliant idea to use atomic counters for outstanding and reserved
extents, but this turned out to be a bad idea. Consider this where we have 1
outstanding extent and 1 reserved extent

Reserver                                  Releaser
                                          atomic_dec(outstanding) now 0
atomic_read(outstanding)+1 get 1
atomic_read(reserved) get 1
don't actually reserve anything because
they are the same
                                          atomic_cmpxchg(reserved, 1, 0)
atomic_inc(outstanding)
atomic_add(0, reserved)
                                          free reserved space for 1 extent

Then the reserver now has no actual space reserved for it, and when it goes to
finish the ordered IO it won't have enough space to do its allocation and you
get those lovely warnings.

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2011-07-28 00:46:44 +0800