Doug / smarc-fsl-linux-kernel | Embedian Git Server

30 Jul, 2012

2 commits

0b56185b0 xfs: do not take the iolock in xfs_inactive ... Browse Code »

An inode that enters xfs_inactive has been removed from all global
lists but the inode hash, and can't be recycled in xfs_iget before
it has been marked reclaimable. Thus taking the iolock in here
is not nessecary at all, and given the amount of lockdep false
positives it has triggered already I'd rather remove the locking.

The only change outside of xfs_inactive is relaxing an assert in
xfs_itruncate_extents.

Signed-off-by: Christoph Hellwig
Reviewed-by: Rich Johnston
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-30 05:16:49 +0800
08358906e xfs: remove the alloc_done argument to xfs_dialloc ... Browse Code »

We can simplify check the IO_agbp pointer for being non-NULL instead of
passing another argument through two layers of function calls.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-30 05:00:31 +0800

22 Jul, 2012

2 commits

129dbc9a2 xfs: remove xfs_inotobp ... Browse Code »

There is no need to keep this helper around, opencoding it in the only
caller is just as clear.

Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-22 23:55:36 +0800
475ee413f xfs: merge xfs_itobp into xfs_imap_to_bp ... Browse Code »

All callers of xfs_imap_to_bp want the dinode pointer, so let's calculate it
inside xfs_imap_to_bp. Once that is done xfs_itobp becomes a fairly pointless
wrapper which can be replaced with direct calls to xfs_imap_to_bp.

Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-07-22 23:46:56 +0800

15 Jun, 2012

1 commit

32972383c xfs: make largest supported offset less shouty ... Browse Code »

XFS_MAXIOFFSET() is just a simple macro that resolves to
mp->m_maxioffset. It doesn't need to exist, and it just makes the
code unnecessarily loud and shouty.

Make it quiet and easy to read.

Signed-off-by: Dave Chinner
Reviewed-by: Eric Sandeen
Signed-off-by: Ben Myers

Dave Chinner
2012-06-15 01:28:24 +0800

15 May, 2012

8 commits

611c99468 xfs: make XBF_MAPPED the default behaviour ... Browse Code »

Rather than specifying XBF_MAPPED for almost all buffers, introduce
XBF_UNMAPPED for the couple of users that use unmapped buffers.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:03 +0800
ad1e95c54 xfs: clean up xfs_bit.h includes ... Browse Code »

With the removal of xfs_rw.h and other changes over time, xfs_bit.h
is being included in many files that don't actually need it. Clean
up the includes as necessary.

Also move the only-used-once xfs_ialloc_find_free() static inline
function out of a header file that is widely included to reduce
the number of needless dependencies on xfs_bit.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:21:00 +0800
2a0ec1d9e xfs: move xfs_get_extsz_hint() and kill xfs_rw.h ... Browse Code »

The only thing left in xfs_rw.h is a function prototype for an inode
function. Move that to xfs_inode.h, and kill xfs_rw.h.

Also move the function implementing the prototype from xfs_rw.c to
xfs_inode.c so we only have one function left in xfs_rw.c

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:58 +0800
a8acad707 xfs: kill XBF_LOCK ... Browse Code »

Buffers are always returned locked from the lookup routines. Hence
we don't need to tell the lookup routines to return locked buffers,
on to try and lock them. Remove XBF_LOCK from all the callers and
from internal buffer cache usage.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:50 +0800
04913fdd9 xfs: pass shutdown method into xfs_trans_ail_delete_bulk ... Browse Code »

xfs_trans_ail_delete_bulk() can be called from different contexts so
if the item is not in the AIL we need different shutdown for each
context. Pass in the shutdown method needed so the correct action
can be taken.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:33 +0800
43ff2122e xfs: on-stack delayed write buffer lists ... Browse Code »

Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
and write back the buffers per-process instead of by waking up xfsbufd.

This is now easily doable given that we have very few places left that write
delwri buffers:

- log recovery:
Only done at mount time, and already forcing out the buffers
synchronously using xfs_flush_buftarg

- quotacheck:
Same story.

- dquot reclaim:
Writes out dirty dquots on the LRU under memory pressure. We might
want to look into doing more of this via xfsaild, but it's already
more optimal than the synchronous inode reclaim that writes each
buffer synchronously.

- xfsaild:
This is the main beneficiary of the change. By keeping a local list
of buffers to write we reduce latency of writing out buffers, and
more importably we can remove all the delwri list promotions which
were hitting the buffer cache hard under sustained metadata loads.

The implementation is very straight forward - xfs_buf_delwri_queue now gets
a new list_head pointer that it adds the delwri buffers to, and all callers
need to eventually submit the list using xfs_buf_delwi_submit or
xfs_buf_delwi_submit_nowait. Buffers that already are on a delwri list are
skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
list. The biggest change to pass down the buffer list was done to the AIL
pushing. Now that we operate on buffers the trylock, push and pushbuf log
item methods are merged into a single push routine, which tries to lock the
item, and if possible add the buffer that needs writeback to the buffer list.
This leads to much simpler code than the previous split but requires the
individual IOP_PUSH instances to unlock and reacquire the AIL around calls
to blocking routines.

Given that xfsailds now also handle writing out buffers, the conditions for
log forcing and the sleep times needed some small changes. The most
important one is that we consider an AIL busy as long we still have buffers
to push, and the other one is that we do increment the pushed LSN for
buffers that are under flushing at this moment, but still count them towards
the stuck items for restart purposes. Without this we could hammer on stuck
items without ever forcing the log and not make progress under heavy random
delete workloads on fast flash storage devices.

[ Dave Chinner:
- rebase on previous patches.
- improved comments for XBF_DELWRI_Q handling
- fix XBF_ASYNC handling in queue submission (test 106 failure)
- rename delwri submit function buffer list parameters for clarity
- xfs_efd_item_push() should return XFS_ITEM_PINNED ]

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-05-15 05:20:31 +0800
4c46819a8 xfs: do not write the buffer from xfs_iflush ... Browse Code »

Instead of writing the buffer directly from inside xfs_iflush return it to
the caller and let the caller decide what to do with the buffer. Also
remove the pincount check in xfs_iflush that all non-blocking callers already
implement and the now unused flags parameter.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-05-15 05:20:28 +0800
32ce90a4b xfs: remove log item from AIL in xfs_iflush after a shutdown ... Browse Code »

If a filesystem has been forced shutdown we are never going to write inodes
to disk, which means the inode items will stay in the AIL until we free
the inode. Currently that is not a problem, but a pending change requires us
to empty the AIL before shutting down the filesystem. In that case leaving
the inode in the AIL is lethal. Make sure to remove the log item from the AIL
to allow emptying the AIL on shutdown filesystems.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-05-15 05:20:25 +0800

14 Mar, 2012

2 commits

f5d8d5c4b xfs: split in-core and on-disk inode log item fields ... Browse Code »

Add a new ili_fields member to the inode log item to isolate the in-memory
flags from the ones that actually go to the log. This will allow tracking
timestamp-only updates for fdatasync and O_DSYNC in the next patch and
prepares for divorcing the on-disk log format from the in-memory log item
a little further down the road.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-03-14 06:08:17 +0800
8a9c9980f xfs: log timestamp updates ... Browse Code »

Timestamps on regular files are the last metadata that XFS does not update
transactionally. Now that we use the delaylog mode exclusively and made
the log scode scale extremly well there is no need to bypass that code for
timestamp updates. Logging all updates allows to drop a lot of code, and
will allow for further performance improvements later on.

Note that this patch drops optimized handling of fdatasync - it will be
added back in a separate commit.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-03-14 06:01:15 +0800

18 Jan, 2012

4 commits

ce7ae151d xfs: remove the i_size field in struct xfs_inode ... Browse Code »

There is no fundamental need to keep an in-memory inode size copy in the XFS
inode. We already have the on-disk value in the dinode, and the separate
in-memory copy that we need for regular files only in the XFS inode.

Remove the xfs_inode i_size field and change the XFS_ISIZE macro to use the
VFS inode i_size field for regular files. Switch code that was directly
accessing the i_size field in the xfs_inode to XFS_ISIZE, or in cases where
we are limited to regular files direct access of the VFS inode i_size field.

This also allows dropping some fairly complicated code in the write path
which dealt with keeping the xfs_inode i_size uptodate with the VFS i_size
that is getting updated inside ->write_end.

Note that we do not bother resetting the VFS i_size when truncating a file
that gets freed to zero as there is no point in doing so because the VFS inode
is no longer in use at this point. Just relax the assert in xfs_ifree to
only check the on-disk size instead.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers

Christoph Hellwig
2012-01-18 05:08:53 +0800
f392e6319 xfs: replace i_pin_wait with a bit waitqueue ... Browse Code »

Replace i_pin_wait, which is only used during synchronous inode flushing
with a bit waitqueue. This trades off a much smaller inode against
slightly slower wakeup performance, and saves 12 (32-bit) or 20 (64-bit)
bytes in the XFS inode.

Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers

Christoph Hellwig
2012-01-18 05:07:54 +0800
474fce067 xfs: replace i_flock with a sleeping bitlock ... Browse Code »

We almost never block on i_flock, the exception is synchronous inode
flushing. Instead of bloating the inode with a 16/24-byte completion
that we abuse as a semaphore just implement it as a bitlock that uses
a bit waitqueue for the rare sleeping path. This primarily is a
tradeoff between a much smaller inode and a faster non-blocking
path vs faster wakeups, and we are much better off with the former.

A small downside is that we will lose lockdep checking for i_flock, but
given that it's always taken inside the ilock that should be acceptable.

Note that for example the inode writeback locking is implemented in a
very similar way.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Ben Myers

Christoph Hellwig
2012-01-18 05:06:45 +0800
8096b1ebb xfs: remove the if_ext_max field in struct xfs_ifork ... Browse Code »

We spent a lot of effort to maintain this field, but it always equals to the
fork size divided by the constant size of an extent. The prime use of it is
to assert that the two stay in sync. Just divide the fork size by the extent
size in the few places that we actually use it and remove the overhead
of maintaining it. Also introduce a few helpers to consolidate the places
where we actually care about the value.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Christoph Hellwig
2012-01-18 05:02:28 +0800

14 Jan, 2012

1 commit

673e8e597 xfs: remove xfs_itruncate_data ... Browse Code »

This wrapper isn't overly useful, not to say rather confusing.

Around the call to xfs_itruncate_extents it does:

- add tracing
- add a few asserts in debug builds
- conditionally update the inode size in two places
- log the inode

Both the tracing and the inode logging can be moved to xfs_itruncate_extents
as they are useful for the attribute fork as well - in fact the attr code
already does an equivalent xfs_trans_log_inode call just after calling
xfs_itruncate_extents. The conditional size updates are a mess, and there
was no reason to do them in two places anyway, as the first one was
conditional on the inode having extents - but without extents we
xfs_itruncate_extents would be a no-op and the placement wouldn't matter
anyway. Instead move the size assignments and the asserts that make sense
to the callers that want it.

As a side effect of this clean up xfs_setattr_size by introducing variables
for the old and new inode size, and moving the size updates into a common
place.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers

Christoph Hellwig
2012-01-14 02:11:45 +0800

04 Jan, 2012

1 commit

576b1d67c xfs: propagate umode_t ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:55:00 +0800

30 Nov, 2011

1 commit

4dd2cb4a2 xfs: force buffer writeback before blocking on the ilock in inode reclaim ... Browse Code »

If we are doing synchronous inode reclaim we block the VM from making
progress in memory reclaim. So if we encouter a flush locked inode
promote it in the delwri list and wake up xfsbufd to write it out now.
Without this we can get hangs of up to 30 seconds during workloads hitting
synchronous inode reclaim.

The scheme is copied from what we do for dquot reclaims.

Reported-by: Simon Kirby
Signed-off-by: Christoph Hellwig
Tested-by: Simon Kirby
Signed-off-by: Ben Myers

Christoph Hellwig
2011-11-30 02:06:14 +0800

12 Oct, 2011

7 commits

c867cb616 xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE ... Browse Code »

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Alex Elder

Christoph Hellwig
2011-10-12 10:15:10 +0800
38f232324 xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF ... Browse Code »

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Alex Elder

Christoph Hellwig
2011-10-12 10:15:09 +0800
ddc3415ab xfs: simplify xfs_trans_ijoin* again ... Browse Code »

There is no reason to keep a reference to the inode even if we unlock
it during transaction commit because we never drop a reference between
the ijoin and commit. Also use this fact to merge xfs_trans_ijoin_ref
back into xfs_trans_ijoin - the third argument decides if an unlock
is needed now.

I'm actually starting to wonder if allowing inodes to be unlocked
at transaction commit really is worth the effort. The only real
benefit is that they can be unlocked earlier when commiting a
synchronous transactions, but that could be solved by doing the
log force manually after the unlock, too.

Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder

Christoph Hellwig
2011-10-12 10:15:08 +0800
5c8ed2021 xfs: introduce xfs_bmapi_read() ... Browse Code »

xfs_bmapi() currently handles both extent map reading and
allocation. As a result, the code is littered with "if (wr)"
branches to conditionally do allocation operations if required.
This makes the code much harder to follow and causes significant
indent issues with the code.

Given that read mapping is much simpler than allocation, we can
split out read mapping from xfs_bmapi() and reuse the logic that
we have already factored out do do all the hard work of handling the
extent map manipulations. The results in a much simpler function for
the common extent read operations, and will allow the allocation
code to be simplified in another commit.

Once xfs_bmapi_read() is implemented, convert all the callers of
xfs_bmapi() that are only reading extents to use the new function.

Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2011-10-12 10:15:03 +0800
2a30f36d9 xfs: Check the return value of xfs_trans_get_buf() ... Browse Code »

Check the return value of xfs_trans_get_buf() and fail
appropriately.

Signed-off-by: Chandra Seetharaman
Signed-off-by: Alex Elder

Chandra Seetharaman
2011-10-12 10:15:01 +0800
c2b006c1d xfs: let xfs_bwrite callers handle the xfs_buf_relse ... Browse Code »

Remove the xfs_buf_relse from xfs_bwrite and let the caller handle it to
mirror the delwri and read paths.

Also remove the mount pointer passed to xfs_bwrite, which is superflous now
that we have a mount pointer in the buftarg.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Alex Elder

Christoph Hellwig
2011-10-12 10:15:00 +0800
61551f1ee xfs: call xfs_buf_delwri_queue directly ... Browse Code »

Unify the ways we add buffers to the delwri queue by always calling
xfs_buf_delwri_queue directly. The xfs_bdwrite functions is removed and
opencoded in its callers, and the two places setting XBF_DELWRI while a
buffer is locked and expecting xfs_buf_unlock to pick it up are converted
to call xfs_buf_delwri_queue directly, too. Also replace the
XFS_BUF_UNDELAYWRITE macro with direct calls to xfs_buf_delwri_dequeue
to make the explicit queuing/dequeuing more obvious.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Alex Elder

Christoph Hellwig
2011-10-12 10:14:59 +0800

08 Aug, 2011

1 commit

2ddb4e940 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux Browse Code »

Alex Elder
2011-08-08 20:06:24 +0800

27 Jul, 2011

1 commit

abbede1b3 xfs: get rid of open-coded S_ISREG(), etc. ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2011-07-27 03:05:16 +0800

26 Jul, 2011

2 commits

811e64c71 Replace the macro XFS_BUF_ISPINNED with helper xfs_buf_ispinned ... Browse Code »

Replace the macro XFS_BUF_ISPINNED with an inline helper function
xfs_buf_ispinned, and change all its usages.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Chandra Seetharaman
2011-07-26 04:03:22 +0800
5a52c2a58 xfs: Remove the macro XFS_BUF_ERROR and family ... Browse Code »

Remove the definitions and usage of the macros XFS_BUF_ERROR,
XFS_BUF_GETERROR and XFS_BUF_ISERROR.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Chandra Seetharaman
2011-07-26 03:57:46 +0800

13 Jul, 2011

3 commits

c84470dda xfs: remove leftovers of the old btree tracing code ... Browse Code »

Remove various bits left over from the old kdb-only btree tracing code, but
leave the actual trace point stubs in place to ease adding new event based
btree tracing.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-13 19:43:50 +0800
cb669ca57 xfs: remove wrappers around b_iodone ... Browse Code »

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-13 19:43:49 +0800
adadbeefb xfs: remove wrappers around b_fspriv ... Browse Code »

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-13 19:43:49 +0800

08 Jul, 2011

4 commits

69ef921b5 xfs: byteswap constants instead of variables ... Browse Code »

Micro-optimize various comparisms by always byteswapping the constant
instead of the variable, which allows to do the swap at compile instead
of runtime.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-08 20:36:05 +0800
f3ca87389 xfs: remove i_transp ... Browse Code »

Remove the transaction pointer in the inode. It's only used to avoid
passing down an argument in the bmap code, and for a few asserts in
the transaction code right now.

Also use the local variable ip in a few more places in xfs_inode_item_unlock,
so that it isn't only used for debug builds after the above change.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-08 20:34:47 +0800
8f04c47aa xfs: split xfs_itruncate_finish ... Browse Code »

Split the guts of xfs_itruncate_finish that loop over the existing extents
and calls xfs_bunmapi on them into a new helper, xfs_itruncate_externs.
Make xfs_attr_inactive call it directly instead of xfs_itruncate_finish,
which allows to simplify the latter a lot, by only letting it deal with
the data fork. As a result xfs_itruncate_finish is renamed to
xfs_itruncate_data to make its use case more obvious.

Also remove the sync parameter from xfs_itruncate_data, which has been
unessecary since the introduction of the busy extent list in 2002, and
completely dead code since 2003 when the XFS_BMAPI_ASYNC parameter was
made a no-op.

I can't actually see why the xfs_attr_inactive needs to set the transaction
sync, but let's keep this patch simple and without changes in behaviour.

Also avoid passing a useless argument to xfs_isize_check, and make it
private to xfs_inode.c.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-08 20:34:34 +0800
857b9778d xfs: kill xfs_itruncate_start ... Browse Code »

xfs_itruncate_start is a rather length wrapper that evaluates to a call
to xfs_ioend_wait and xfs_tosspages, and only has two callers.

Instead of using the complicated checks left over from IRIX where we
can to truncate the pagecache just call xfs_tosspages
(aka truncate_inode_pages) directly as we want to get rid of all data
after i_size, and truncate_inode_pages handles incorrect alignments
and too large offsets just fine.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Reviewed-by: Dave Chinner

Christoph Hellwig
2011-07-08 20:34:30 +0800