Doug / smarc-fsl-linux-kernel | Embedian Git Server

24 Jul, 2012

1 commit

a66d2c8f7 Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull the big VFS changes from Al Viro:
"This one is *big* and changes quite a few things around VFS. What's in there:

- the first of two really major architecture changes - death to open
intents.

The former is finally there; it was very long in making, but with
Miklos getting through really hard and messy final push in
fs/namei.c, we finally have it. Unlike his variant, this one
doesn't introduce struct opendata; what we have instead is
->atomic_open() taking preallocated struct file * and passing
everything via its fields.

Instead of returning struct file *, it returns -E... on error, 0
on success and 1 in "deal with it yourself" case (e.g. symlink
found on server, etc.).

See comments before fs/namei.c:atomic_open(). That made a lot of
goodies finally possible and quite a few are in that pile:
->lookup(), ->d_revalidate() and ->create() do not get struct
nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
flags instead, ->create() gets "do we want it exclusive" flag.

With the introduction of new helper (kern_path_locked()) we are rid
of all struct nameidata instances outside of fs/namei.c; it's still
visible in namei.h, but not for long. Come the next cycle,
declaration will move either to fs/internal.h or to fs/namei.c
itself. [me, miklos, hch]

- The second major change: behaviour of final fput(). Now we have
__fput() done without any locks held by caller *and* not from deep
in call stack.

That obviously lifts a lot of constraints on the locking in there.
Moreover, it's legal now to call fput() from atomic contexts (which
has immediately simplified life for aio.c). We also don't need
anti-recursion logics in __scm_destroy() anymore.

There is a price, though - the damn thing has become partially
asynchronous. For fput() from normal process we are guaranteed
that pending __fput() will be done before the caller returns to
userland, exits or gets stopped for ptrace.

For kernel threads and atomic contexts it's done via
schedule_work(), so theoretically we might need a way to make sure
it's finished; so far only one such place had been found, but there
might be more.

There's flush_delayed_fput() (do all pending __fput()) and there's
__fput_sync() (fput() analog doing __fput() immediately). I hope
we won't need them often; see warnings in fs/file_table.c for
details. [me, based on task_work series from Oleg merged last
cycle]

- sync series from Jan

- large part of "death to sync_supers()" work from Artem; the only
bits missing here are exofs and ext4 ones. As far as I understand,
those are going via the exofs and ext4 trees resp.; once they are
in, we can put ->write_super() to the rest, along with the thread
calling it.

- preparatory bits from unionmount series (from dhowells).

- assorted cleanups and fixes all over the place, as usual.

This is not the last pile for this cycle; there's at least jlayton's
ESTALE work and fsfreeze series (the latter - in dire need of fixes,
so I'm not sure it'll make the cut this cycle). I'll probably throw
symlink/hardlink restrictions stuff from Kees into the next pile, too.
Plus there's a lot of misc patches I hadn't thrown into that one -
it's large enough as it is..."

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
switch dentry_open() to struct path, make it grab references itself
spufs: shift dget/mntget towards dentry_open()
zoran: don't bother with struct file * in zoran_map
ecryptfs: don't reinvent the wheels, please - use struct completion
don't expose I_NEW inodes via dentry->d_inode
tidy up namei.c a bit
unobfuscate follow_up() a bit
ext3: pass custom EOF to generic_file_llseek_size()
ext4: use core vfs llseek code for dir seeks
vfs: allow custom EOF in generic_file_llseek code
vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
vfs: Remove unnecessary flushing of block devices
vfs: Make sys_sync writeout also block device inodes
vfs: Create function for iterating over block devices
vfs: Reorder operations during sys_sync
quota: Move quota syncing to ->sync_fs method
quota: Split dquot_quota_sync() to writeback and cache flushing part
vfs: Move noop_backing_dev_info check from sync into writeback
...

Linus Torvalds
2012-07-24 03:27:27 +0800

23 Jul, 2012

1 commit

765927b2d switch dentry_open() to struct path, make it grab references itself ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2012-07-23 04:01:29 +0800

14 Jul, 2012

6 commits

ebfc3b49a don't pass nameidata to ->create() ... Browse Code »

boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:47 +0800
00cd8dd3b stop passing nameidata to ->lookup() ... Browse Code »

Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...

Signed-off-by: Al Viro

Al Viro
2012-07-14 20:34:32 +0800
1632dcc93 xfs: do not call xfs_bdstrat_cb in xfs_buf_iodone_callbacks ... Browse Code »

xfs_bdstrat_cb only adds a check for a shutdown filesystem over
xfs_buf_iorequest, but xfs_buf_iodone_callbacks just checked for a shut down
filesystem a little earlier. In addition the shutdown handling in
xfs_bdstrat_cb is not very suitable for this caller.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-14 02:09:49 +0800
40a9b7963 xfs: prevent recursion in xfs_buf_iorequest ... Browse Code »

If the b_iodone handler is run in calling context in xfs_buf_iorequest we
can run into a recursion where xfs_buf_iodone_callbacks keeps calling back
into xfs_buf_iorequest because an I/O error happened, which keeps calling
back into xfs_buf_iorequest. This chain will usually not take long
because the filesystem gets shut down because of log I/O errors, but even
over a short time it can cause stack overflows if run on the same context.

As a short term workaround make sure we always call the iodone handler in
workqueue context.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-14 02:09:39 +0800
aa292847b xfs: don't defer metadata allocation to the workqueue ... Browse Code »

Almost all metadata allocations come from shallow stack usage
situations. Avoid the overhead of switching the allocation to a
workqueue as we are not in danger of running out of stack when
making these allocations. Metadata allocations are already marked
through the args that are passed down, so this is trivial to do.

Signed-off-by: Dave Chinner
Reported-by: Mel Gorman
Tested-by: Mel Gorman
Signed-off-by: Ben Myers

Dave Chinner
2012-07-14 02:09:27 +0800
e3a746f5a xfs: really fix the cursor leak in xfs_alloc_ag_vextent_near ... Browse Code »

The current cursor is reallocated when retrying the allocation, so
the existing cursor needs to be destroyed in both the restart and
the failure cases.

Signed-off-by: Dave Chinner
Tested-by: Mike Snitzer
Signed-off-by: Ben Myers

Dave Chinner
2012-07-14 02:09:18 +0800

22 Jun, 2012

5 commits

f7bdf03a9 xfs: rename log structure to xlog ... Browse Code »

Rename the XFS log structure to xlog to help crash distinquish it from the
other logs in Linux.

Signed-off-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Mark Tinguely
2012-06-22 03:21:11 +0800
8866fc6fa xfs: shutdown xfs_sync_worker before the log ... Browse Code »

Revert commit 1307bbd, which uses the s_umount semaphore to provide
exclusion between xfs_sync_worker and unmount, in favor of shutting down
the sync worker before freeing the log in xfs_log_unmount. This is a
cleaner way of resolving the race between xfs_sync_worker and unmount
than using s_umount.

Signed-off-by: Ben Myers
Reviewed-by: Mark Tinguely
Reviewed-by: Dave Chinner

Ben Myers
2012-06-22 03:20:48 +0800
59c84ed0d xfs: Fix overallocation in xfs_buf_allocate_memory() ... Browse Code »

Commit de1cbee which removed b_file_offset in favor of b_bn introduced a bug
causing xfs_buf_allocate_memory() to overestimate the number of necessary
pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
every buffer is straddling a page boundary which causes
xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
which is unnecessary.

Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
buffer is inserted into the cache only after being fully initialized now.
So just make xfs_buf_alloc() fill in proper block number from the beginning.

CC: David Chinner
Signed-off-by: Jan Kara
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Jan Kara
2012-06-22 03:20:36 +0800
76d095388 xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near ... Browse Code »

When we fail to find an matching extent near the requested extent
specification during a left-right distance search in
xfs_alloc_ag_vextent_near, we fail to free the original cursor that
we used to look up the XFS_BTNUM_CNT tree and hence leak it.

Reported-by: Chris J Arges
Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-06-22 03:20:20 +0800
9a3a5dab6 xfs: check for stale inode before acquiring iflock on push ... Browse Code »

An inode in the AIL can be flush locked and marked stale if
a cluster free transaction occurs at the right time. The
inode item is then marked as flushing, which causes xfsaild
to spin and leaves the filesystem stalled. This is
reproduced by running xfstests 273 in a loop for an
extended period of time.

Check for stale inodes before the flush lock. This marks
the inode as pinned, leads to a log flush and allows the
filesystem to proceed.

Signed-off-by: Brian Foster
Reviewed-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Brian Foster
2012-06-22 03:20:06 +0800

21 Jun, 2012

2 commits

3b876c8f2 xfs: fix debug_object WARN at xfs_alloc_vextent() ... Browse Code »

Fengguang reports:

[ 780.529603] XFS (vdd): Ending clean mount
[ 781.454590] ODEBUG: object is on stack, but not annotated
[ 781.455433] ------------[ cut here ]------------
[ 781.455433] WARNING: at /c/kernel-tests/sound/lib/debugobjects.c:301 __debug_object_init+0x173/0x1f1()
[ 781.455433] Hardware name: Bochs
[ 781.455433] Modules linked in:
[ 781.455433] Pid: 26910, comm: kworker/0:2 Not tainted 3.4.0+ #51
[ 781.455433] Call Trace:
[ 781.455433] [] warn_slowpath_common+0x83/0x9b
[ 781.455433] [] warn_slowpath_null+0x1a/0x1c
[ 781.455433] [] __debug_object_init+0x173/0x1f1
[ 781.455433] [] debug_object_init+0x14/0x16
[ 781.455433] [] __init_work+0x20/0x22
[ 781.455433] [] xfs_alloc_vextent+0x6c/0xd5

Use INIT_WORK_ONSTACK in xfs_alloc_vextent instead of INIT_WORK.

Reported-by: Wu Fengguang
Signed-off-by: Jie Liu
Signed-off-by: Ben Myers

Jeff Liu
2012-06-21 03:58:24 +0800
66f931138 xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2) ... Browse Code »

On filesytems with a block size smaller than PAGE_SIZE we currently have
a problem with unwritten extents. If a we have multi-block page for
which an unwritten extent has been allocated, and only some of the
buffers have been written to, and they are not contiguous, we can expose
stale data from disk in the blocks between the writes after extent
conversion.

Example of a page with unwritten and real data.
buffer content
0 empty b_state = 0
1 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
2 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
3 empty b_state = 0
4 empty b_state = 0
5 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
6 DATA b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
7 empty b_state = 0

Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
empty. Currently buffers 1, 2, 5, and 6 are added to a single ioend,
and when IO has completed, extent conversion creates a real extent from
block 1 through block 6, leaving 0 and 7 unwritten. However buffers 3
and 4 were not written to disk, so stale data is exposed from those
blocks on a subsequent read.

Fix this by setting iomap_valid = 0 when we find a buffer that is not
Uptodate. This ensures that buffers 5 and 6 are not added to the same
ioend as buffers 1 and 2. Later these blocks will be converted into two
separate real extents, leaving the blocks in between unwritten.

Signed-off-by: Alain Renaud
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Alain Renaud
2012-06-21 03:57:28 +0800

02 Jun, 2012

1 commit

c3b2da314 fs: introduce inode operation ->update_time ... Browse Code »

Btrfs has to make sure we have space to allocate new blocks in order to modify
the inode, so updating time can fail. We've gotten around this by having our
own file_update_time but this is kind of a pain, and Christoph has indicated he
would like to make xfs do something different with atime updates. So introduce
->update_time, where we will deal with i_version an a/m/c time updates and
indicate which changes need to be made. The normal version just does what it
has always done, updates the time and marks the inode dirty, and then
filesystems can choose to do something different.

I've gone through all of the users of file_update_time and made them check for
errors with the exception of the fault code since it's complicated and I wasn't
quite sure what to do there, also Jan is going to be pushing the file time
updates into page_mkwrite for those who have it so that should satisfy btrfs and
make it not a big deal to check the file_update_time() return code in the
generic fault path. Thanks,

Signed-off-by: Josef Bacik

Josef Bacik
2012-06-02 00:07:25 +0800

30 May, 2012

2 commits

b0b0382bb ->encode_fh() API change ... Browse Code »

pass inode + parent's inode or NULL instead of dentry + bool saying
whether we want the parent or not.

NOTE: that needs ceph fix folded in.

Signed-off-by: Al Viro

Al Viro
2012-05-30 11:28:33 +0800
77ba78776 xfs: switch to proper __bitwise type for KM_... flags ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2012-05-30 11:28:32 +0800

29 May, 2012

1 commit

90324cc1b Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux ... Browse Code »

Pull writeback tree from Wu Fengguang:
"Mainly from Jan Kara to avoid iput() in the flusher threads."

* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: Avoid iput() from flusher thread
vfs: Rename end_writeback() to clear_inode()
vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
writeback: Refactor writeback_single_inode()
writeback: Remove wb->list_lock from writeback_single_inode()
writeback: Separate inode requeueing after writeback
writeback: Move I_DIRTY_PAGES handling
writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
writeback: Move clearing of I_SYNC into inode_sync_complete()
writeback: initialize global_dirty_limit
fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
mm: page-writeback.c: local functions should not be exposed globally

Linus Torvalds
2012-05-29 00:54:45 +0800

21 May, 2012

3 commits

14c26c6a0 xfs: add trace points for log forces ... Browse Code »

To enable easy tracing of the location of log forces and the
frequency of them via perf, add a pair of trace points to the log
force functions. This will help debug where excessive log forces
are being issued from by simple perf commands like:

# ~/perf/perf top -e xfs:xfs_log_force -G -U

Which gives this sort of output:

Events: 141 xfs:xfs_log_force
- 100.00% [kernel] [k] xfs_log_force
- xfs_log_force
87.04% xfsaild
kthread
kernel_thread_helper
- 12.87% xfs_buf_lock
_xfs_buf_find
xfs_buf_get
xfs_trans_get_buf
xfs_da_do_buf
xfs_da_get_buf
xfs_dir2_data_init
xfs_dir2_leaf_addname
xfs_dir_createname
xfs_create
xfs_vn_mknod
xfs_vn_create
vfs_create
do_last.isra.41
path_openat
do_filp_open
do_sys_open
sys_open
system_call_fastpath

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-21 23:45:44 +0800
3ba316037 xfs: fix memory reclaim deadlock on agi buffer ... Browse Code »

Note xfs_iget can be called while holding a locked agi buffer. If
it goes into memory reclaim then inode teardown may try to lock the
same buffer. Prevent the deadlock by calling radix_tree_preload
with GFP_NOFS.

Signed-off-by: Peter Watkins
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Peter Watkins
2012-05-21 23:45:44 +0800
ea562ed6e xfs: fix delalloc quota accounting on failure ... Browse Code »

xfstest 270 was causing quota reservations way beyond what was sane
(ten to hundreds of TB) for a 4GB filesystem. There's a sign problem
in the error handling path of xfs_bmapi_reserve_delalloc() because
xfs_trans_unreserve_quota_nblks() simple negates the value passed -
which doesn't work for an unsigned variable. This causes
reservations of close to 2^32 block instead of removing a
reservation of a handful of blocks.

Fix the same problem in the other xfs_trans_unreserve_quota_nblks()
callers where unsigned integer variables are used, too.

Signed-off-by: Dave Chinner
Reviewed-by: Eric Sandeen
Signed-off-by: Ben Myers

Dave Chinner
2012-05-21 23:45:43 +0800

16 May, 2012

1 commit

1307bbd2a xfs: protect xfs_sync_worker with s_umount semaphore ... Browse Code »

xfs_sync_worker checks the MS_ACTIVE flag in s_flags to avoid doing
work during mount and unmount. This flag can be cleared by unmount
after the xfs_sync_worker checks it but before the work is completed.
The has caused crashes in the completion handler for the dummy
transaction commited by xfs_sync_worker:

PID: 27544 TASK: ffff88013544e040 CPU: 3 COMMAND: "kworker/3:0"
#0 [ffff88016fdff930] machine_kexec at ffffffff810244e9
#1 [ffff88016fdff9a0] crash_kexec at ffffffff8108d053
#2 [ffff88016fdffa70] oops_end at ffffffff813ad1b8
#3 [ffff88016fdffaa0] no_context at ffffffff8102bd48
#4 [ffff88016fdffaf0] __bad_area_nosemaphore at ffffffff8102c04d
#5 [ffff88016fdffb40] bad_area_nosemaphore at ffffffff8102c12e
#6 [ffff88016fdffb50] do_page_fault at ffffffff813afaee
#7 [ffff88016fdffc60] page_fault at ffffffff813ac635
[exception RIP: xlog_get_lowest_lsn+0x30]
RIP: ffffffffa04a9910 RSP: ffff88016fdffd10 RFLAGS: 00010246
RAX: ffffc90014e48000 RBX: ffff88014d879980 RCX: ffff88014d879980
RDX: ffff8802214ee4c0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88016fdffd10 R8: ffff88014d879a80 R9: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8802214ee400
R13: ffff88014d879980 R14: 0000000000000000 R15: ffff88022fd96605
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffff88016fdffd18] xlog_state_do_callback at ffffffffa04aa186 [xfs]
#9 [ffff88016fdffd98] xlog_state_done_syncing at ffffffffa04aa568 [xfs]

Protect xfs_sync_worker by using the s_umount semaphore at the read
level to provide exclusion with unmount while work is progressing.

Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Ben Myers
2012-05-16 03:35:43 +0800

15 May, 2012

17 commits

3fe3e6b18 xfs: introduce SEEK_DATA/SEEK_HOLE support ... Browse Code »

This patch adds lseek(2) SEEK_DATA/SEEK_HOLE functionality to xfs.

Signed-off-by: Jie Liu
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Jeff Liu
2012-05-15 05:21:05 +0800
e700a06c7 xfs: make xfs_extent_busy_trim not static ... Browse Code »

Commit e459df5, 'xfs: move busy extent handling to it's own file'
moved some code from xfs_alloc.c into xfs_extent_busy.c for
convenience in userspace code merges. One of the functions moved is
xfs_extent_busy_trim (formerly xfs_alloc_busy_trim) which is defined
STATIC. Unfortunately this function is still used in xfs_alloc.c, and
this results in an undefined symbol in xfs.ko.

Make xfs_extent_busy_trim not static and add its prototype to
xfs_extent_busy.h.

Signed-off-by: Ben Myers
Reviewed-by: Mark Tinguely

Ben Myers
2012-05-15 05:21:04 +0800
611c99468 xfs: make XBF_MAPPED the default behaviour ... Browse Code »

Rather than specifying XBF_MAPPED for almost all buffers, introduce
XBF_UNMAPPED for the couple of users that use unmapped buffers.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:03 +0800
d4f3512b0 xfs: flush outstanding buffers on log mount failure ... Browse Code »

When we fail to mount the log in xfs_mountfs(), we tear down all the
infrastructure we have already allocated. However, the process of
mounting the log may have progressed to the point of reading,
caching and modifying buffers in memory. Hence before we can free
all the infrastructure, we have to flush and remove all the buffers
from memory.

Problem first reported by Eric Sandeen, later a different incarnation
was reported by Ben Myers.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:02 +0800
12bcb3f7d xfs: Properly exclude IO type flags from buffer flags ... Browse Code »

Recent event tracing during a debugging session showed that flags
that define the IO type for a buffer are leaking into the flags on
the buffer incorrectly. Fix the flag exclusion mask in
xfs_buf_alloc() to avoid problems that may be caused by such
leakage.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:01 +0800
ad1e95c54 xfs: clean up xfs_bit.h includes ... Browse Code »

With the removal of xfs_rw.h and other changes over time, xfs_bit.h
is being included in many files that don't actually need it. Clean
up the includes as necessary.

Also move the only-used-once xfs_ialloc_find_free() static inline
function out of a header file that is widely included to reduce
the number of needless dependencies on xfs_bit.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:00 +0800
2af51f3a4 xfs: move xfs_do_force_shutdown() and kill xfs_rw.c ... Browse Code »

xfs_do_force_shutdown now is the only thing in xfs_rw.c. There is no
need to keep it in it's own file anymore, so move it to xfs_fsops.c
next to xfs_fs_goingdown() and kill xfs_rw.c.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:59 +0800
2a0ec1d9e xfs: move xfs_get_extsz_hint() and kill xfs_rw.h ... Browse Code »

The only thing left in xfs_rw.h is a function prototype for an inode
function. Move that to xfs_inode.h, and kill xfs_rw.h.

Also move the function implementing the prototype from xfs_rw.c to
xfs_inode.c so we only have one function left in xfs_rw.c

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:58 +0800
fd50092c0 xfs: move xfs_fsb_to_db to xfs_bmap.h ... Browse Code »

This is the only remaining useful function in xfs_rw.h, so move it
to a header file responsible for block mapping functions that the
callers already include. Soon we can get rid of xfs_rw.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:57 +0800
4ecbfe637 xfs: clean up busy extent naming ... Browse Code »

Now that the busy extent tracking has been moved out of the
allocation files, clean up the namespace it uses to
"xfs_extent_busy" rather than a mix of "xfs_busy" and
"xfs_alloc_busy".

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:56 +0800
efc27b525 xfs: move busy extent handling to it's own file ... Browse Code »

To make it easier to handle userspace code merges, move all the busy
extent handling out of the allocation code and into it's own file.
The userspace code does not need the busy extent code, so this
simplifies the merging of the kernel code into the userspace
xfsprogs library.

Because the busy extent code has been almost completely rewritten
over the past couple of years, also update the copyright on this new
file to include the authors that made all those changes.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:55 +0800
60a34607b xfs: move xfsagino_t to xfs_types.h ... Browse Code »

Untangle the header file includes a bit by moving the definition of
xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
xfs_ag.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:54 +0800
bc4010ecb xfs: use iolock on XFS_IOC_ALLOCSP calls ... Browse Code »

fsstress has a particular effective way of stopping debug XFS
kernels. We keep seeing assert failures due finding delayed
allocation extents where there should be none. This shows up when
extracting extent maps and we are holding all the locks we should be
to prevent races, so this really makes no sense to see these errors.

After checking that fsstress does not use mmap, it occurred to me
that fsstress uses something that no sane application uses - the
XFS_IOC_ALLOCSP ioctl interfaces for preallocation. These interfaces
do allocation of blocks beyond EOF without using preallocation, and
then call setattr to extend and zero the allocated blocks.

THe problem here is this is a buffered write, and hence the
allocation is a delayed allocation. Unlike the buffered IO path, the
allocation and zeroing are not serialised using the IOLOCK. Hence
the ALLOCSP operation can race with operations holding the iolock to
prevent buffered IO operations from occurring.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:53 +0800
aa5c158ec xfs: kill XBF_DONTBLOCK ... Browse Code »

Just about all callers of xfs_buf_read() and xfs_buf_get() use XBF_DONTBLOCK.
This is used to make memory allocation use GFP_NOFS rather than GFP_KERNEL to
avoid recursion through memory reclaim back into the filesystem.

All the blocking get calls in growfs occur inside a transaction, even though
they are no part of the transaction, so all allocation will be GFP_NOFS due to
the task flag PF_TRANS being set. The blocking read calls occur during log
recovery, so they will probably be unaffected by converting to GFP_NOFS
allocations.

Hence make XBF_DONTBLOCK behaviour always occur for buffers and kill the flag.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:52 +0800
7ca790a50 xfs: kill xfs_read_buf() ... Browse Code »

xfs_read_buf() is effectively the same as xfs_trans_read_buf() when called
outside a transaction context. The error handling is slightly different in that
xfs_read_buf stales the errored buffer it gets back, but there is probably good
reason for xfs_trans_read_buf() for doing this.

Hence update xfs_trans_read_buf() to the same error handling as xfs_read_buf(),
and convert all the callers of xfs_read_buf() to use the former function. We can
then remove xfs_read_buf().

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:51 +0800
a8acad707 xfs: kill XBF_LOCK ... Browse Code »

Buffers are always returned locked from the lookup routines. Hence
we don't need to tell the lookup routines to return locked buffers,
on to try and lock them. Remove XBF_LOCK from all the callers and
from internal buffer cache usage.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:50 +0800
795cac72e xfs: kill xfs_buf_btoc ... Browse Code »

xfs_buf_btoc and friends are simple macros that do basic block
to page index conversion and vice versa. These aren't widely used,
and we use open coded masking and shifting everywhere else. Hence
remove the macros and open code the work they do.

Also, use of PAGE_CACHE_{SIZE|SHIFT|MASK} for these macros is now
incorrect - we are using pages directly and not the page cache, so
use PAGE_{SIZE|MASK|SHIFT} instead.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:49 +0800