Eric Lee / smarc-fsl-linux-kernel

27 Jul, 2010

3 commits

73523a2ec xfs: fix gcc 4.6 set but not read and unused statement warnings ... Browse Code »

[hch: dropped a few hunks that need structural changes instead]

Signed-off-by: Andi Kleen
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner

Christoph Hellwig
2010-07-27 02:16:51 +0800
3400777ff xfs: remove unneeded #include statements ... Browse Code »

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner

Christoph Hellwig
2010-07-27 02:16:33 +0800
288699fec xfs: drop dmapi hooks ... Browse Code »

Dmapi support was never merged upstream, but we still have a lot of hooks
bloating XFS for it, all over the fast pathes of the filesystem.

This patch drops over 700 lines of dmapi overhead. If we'll ever get HSM
support in mainline at least the namespace events can be done much saner
in the VFS instead of the individual filesystem, so it's not like this
is much help for future work.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner

Christoph Hellwig
2010-07-27 02:16:33 +0800

24 May, 2010

1 commit

ed3b4d6cd xfs: Improve scalability of busy extent tracking ... Browse Code »

When we free a metadata extent, we record it in the per-AG busy
extent array so that it is not re-used before the freeing
transaction hits the disk. This array is fixed size, so when it
overflows we make further allocation transactions synchronous
because we cannot track more freed extents until those transactions
hit the disk and are completed. Under heavy mixed allocation and
freeing workloads with large log buffers, we can overflow this array
quite easily.

Further, the array is sparsely populated, which means that inserts
need to search for a free slot, and array searches often have to
search many more slots that are actually used to check all the
busy extents. Quite inefficient, really.

To enable this aspect of extent freeing to scale better, we need
a structure that can grow dynamically. While in other areas of
XFS we have used radix trees, the extents being freed are at random
locations on disk so are better suited to being indexed by an rbtree.

So, use a per-AG rbtree indexed by block number to track busy
extents. This incures a memory allocation when marking an extent
busy, but should not occur too often in low memory situations. This
should scale to an arbitrary number of extents so should not be a
limitation for features such as in-memory aggregation of
transactions.

However, there are still situations where we can't avoid allocating
busy extents (such as allocation from the AGFL). To minimise the
overhead of such occurences, we need to avoid doing a synchronous
log force while holding the AGF locked to ensure that the previous
transactions are safely on disk before we use the extent. We can do
this by marking the transaction doing the allocation as synchronous
rather issuing a log force.

Because of the locking involved and the ordering of transactions,
the synchronous transaction provides the same guarantees as a
synchronous log force because it ensures that all the prior
transactions are already on disk when the synchronous transaction
hits the disk. i.e. it preserves the free->allocate order of the
extent correctly in recovery.

By doing this, we avoid holding the AGF locked while log writes are
in progress, hence reducing the length of time the lock is held and
therefore we increase the rate at which we can allocate and free
from the allocation group, thereby increasing overall throughput.

The only problem with this approach is that when a metadata buffer is
marked stale (e.g. a directory block is removed), then buffer remains
pinned and locked until the log goes to disk. The issue here is that
if that stale buffer is reallocated in a subsequent transaction, the
attempt to lock that buffer in the transaction will hang waiting
the log to go to disk to unlock and unpin the buffer. Hence if
someone tries to lock a pinned, stale, locked buffer we need to
push on the log to get it unlocked ASAP. Effectively we are trading
off a guaranteed log force for a much less common trigger for log
force to occur.

Ideally we should not reallocate busy extents. That is a much more
complex fix to the problem as it involves direct intervention in the
allocation btree searches in many places. This is left to a future
set of modifications.

Finally, now that we track busy extents in allocated memory, we
don't need the descriptors in the transaction structure to point to
them. We can replace the complex busy chunk infrastructure with a
simple linked list of busy extents. This allows us to remove a large
chunk of code, making the overall change a net reduction in code
size.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2010-05-24 23:34:00 +0800

22 Jan, 2010

2 commits

a14a348bf xfs: cleanup up xfs_log_force calling conventions ... Browse Code »

Remove the XFS_LOG_FORCE argument which was always set, and the
XFS_LOG_URGE define, which was never used.

Split xfs_log_force into a two helpers - xfs_log_force which forces
the whole log, and xfs_log_force_lsn which forces up to the
specified LSN. The underlying implementations already were entirely
separate, as were the users.

Also re-indent the new _xfs_log_force/_xfs_log_force which
previously had a weird coding style.

Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder

Christoph Hellwig
2010-01-22 03:44:49 +0800
0cadda1c5 xfs: remove duplicate buffer flags ... Browse Code »

Currently we define aliases for the buffer flags in various
namespaces, which only adds confusion. Remove all but the XBF_
flags to clean this up a bit.

Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
uses, but I'll clean that up later.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Alex Elder

Christoph Hellwig
2010-01-22 03:44:36 +0800

16 Jan, 2010

3 commits

e57336ff7 xfs: embed the pagb_list array in the perag structure ... Browse Code »

Now that the perag structure is allocated memory rather than held in
an array, we don't need to have the busy extent array external to
the structure. Embed it into the perag structure to avoid needing an
extra allocation when setting up.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2010-01-16 05:34:39 +0800
1c1c6ebcf xfs: Replace per-ag array with a radix tree ... Browse Code »

The use of an array for the per-ag structures requires reallocation
of the array when growing the filesystem. This requires locking
access to the array to avoid use after free situations, and the
locking is difficult to get right. To avoid needing to reallocate an
array, change the per-ag structures to an allocated object per ag
and index them using a tree structure.

The AGs are always densely indexed (hence the use of an array), but
the number supported is 2^32 and lookups tend to be random and hence
indexing needs to scale. A simple choice is a radix tree - it works
well with this sort of index. This change also removes another
large contiguous allocation from the mount/growfs path in XFS.

The growing process now needs to change to only initialise the new
AGs required for the extra space, and as such only needs to
exclusively lock the tree for inserts. The rest of the code only
needs to lock the tree while doing lookups, and hence this will
remove all the deadlocks that currently occur on the m_perag_lock as
it is now an innermost lock. The lock is also changed to a spinlock
from a read/write lock as the hold time is now extremely short.

To complete the picture, the per-ag structures will need to be
reference counted to ensure that we don't free/modify them while
they are still in use. This will be done in subsequent patch.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2010-01-16 05:33:52 +0800
a862e0fdc xfs: Don't directly reference m_perag in allocation code ... Browse Code »

Start abstracting the perag references so that the indexing of the
structures is not directly coded into all the places that uses the
perag structures. This will allow us to separate the use of the
perag structure and the way it is indexed and hence avoid the known
deadlocks related to growing a busy filesystem.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2010-01-16 05:33:12 +0800

11 Jan, 2010

1 commit

fd45e4784 xfs: Ensure we force all busy extents in range to disk ... Browse Code »

When we search for and find a busy extent during allocation we
force the log out to ensure the extent free transaction is on
disk before the allocation transaction. The current implementation
has a subtle bug in it--it does not handle multiple overlapping
ranges.

That is, if we free lots of little extents into a single
contiguous extent, then allocate the contiguous extent, the busy
search code stops searching at the first extent it finds that
overlaps the allocated range. It then uses the commit LSN of the
transaction to force the log out to.

Unfortunately, the other busy ranges might have more recent
commit LSNs than the first busy extent that is found, and this
results in xfs_alloc_search_busy() returning before all the
extent free transactions are on disk for the range being
allocated. This can lead to potential metadata corruption or
stale data exposure after a crash because log replay won't replay
all the extent free transactions that cover the allocation range.

Modified-by: Alex Elder

(Dropped the "found" argument from the xfs_alloc_busysearch trace
event.)

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder

Dave Chinner
2010-01-11 02:22:02 +0800

15 Dec, 2009

1 commit

0b1b213fc xfs: event tracing support ... Browse Code »

Convert the old xfs tracing support that could only be used with the
out of tree kdb and xfsidbg patches to use the generic event tracer.

To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
all xfs trace channels by:

echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

or alternatively enable single events by just doing the same in one
event subdirectory, e.g.

echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

or set more complex filters, etc. In Documentation/trace/events.txt
all this is desctribed in more detail. To reads the events do a

cat /sys/kernel/debug/tracing/trace

Compared to the last posting this patch converts the tracing mostly to
the one tracepoint per callsite model that other users of the new
tracing facility also employ. This allows a very fine-grained control
of the tracing, a cleaner output of the traces and also enables the
perf tool to use each tracepoint as a virtual performance counter,
allowing us to e.g. count how often certain workloads git various
spots in XFS. Take a look at

http://lwn.net/Articles/346470/

for some examples.

Also the btree tracing isn't included at all yet, as it will require
additional core tracing features not in mainline yet, I plan to
deliver it later.

And the really nice thing about this patch is that it actually removes
many lines of code while adding this nice functionality:

fs/xfs/Makefile | 8
fs/xfs/linux-2.6/xfs_acl.c | 1
fs/xfs/linux-2.6/xfs_aops.c | 52 -
fs/xfs/linux-2.6/xfs_aops.h | 2
fs/xfs/linux-2.6/xfs_buf.c | 117 +--
fs/xfs/linux-2.6/xfs_buf.h | 33
fs/xfs/linux-2.6/xfs_fs_subr.c | 3
fs/xfs/linux-2.6/xfs_ioctl.c | 1
fs/xfs/linux-2.6/xfs_ioctl32.c | 1
fs/xfs/linux-2.6/xfs_iops.c | 1
fs/xfs/linux-2.6/xfs_linux.h | 1
fs/xfs/linux-2.6/xfs_lrw.c | 87 --
fs/xfs/linux-2.6/xfs_lrw.h | 45 -
fs/xfs/linux-2.6/xfs_super.c | 104 ---
fs/xfs/linux-2.6/xfs_super.h | 7
fs/xfs/linux-2.6/xfs_sync.c | 1
fs/xfs/linux-2.6/xfs_trace.c | 75 ++
fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
fs/xfs/linux-2.6/xfs_vnode.h | 4
fs/xfs/quota/xfs_dquot.c | 110 ---
fs/xfs/quota/xfs_dquot.h | 21
fs/xfs/quota/xfs_qm.c | 40 -
fs/xfs/quota/xfs_qm_syscalls.c | 4
fs/xfs/support/ktrace.c | 323 ---------
fs/xfs/support/ktrace.h | 85 --
fs/xfs/xfs.h | 16
fs/xfs/xfs_ag.h | 14
fs/xfs/xfs_alloc.c | 230 +-----
fs/xfs/xfs_alloc.h | 27
fs/xfs/xfs_alloc_btree.c | 1
fs/xfs/xfs_attr.c | 107 ---
fs/xfs/xfs_attr.h | 10
fs/xfs/xfs_attr_leaf.c | 14
fs/xfs/xfs_attr_sf.h | 40 -
fs/xfs/xfs_bmap.c | 507 +++------------
fs/xfs/xfs_bmap.h | 49 -
fs/xfs/xfs_bmap_btree.c | 6
fs/xfs/xfs_btree.c | 5
fs/xfs/xfs_btree_trace.h | 17
fs/xfs/xfs_buf_item.c | 87 --
fs/xfs/xfs_buf_item.h | 20
fs/xfs/xfs_da_btree.c | 3
fs/xfs/xfs_da_btree.h | 7
fs/xfs/xfs_dfrag.c | 2
fs/xfs/xfs_dir2.c | 8
fs/xfs/xfs_dir2_block.c | 20
fs/xfs/xfs_dir2_leaf.c | 21
fs/xfs/xfs_dir2_node.c | 27
fs/xfs/xfs_dir2_sf.c | 26
fs/xfs/xfs_dir2_trace.c | 216 ------
fs/xfs/xfs_dir2_trace.h | 72 --
fs/xfs/xfs_filestream.c | 8
fs/xfs/xfs_fsops.c | 2
fs/xfs/xfs_iget.c | 111 ---
fs/xfs/xfs_inode.c | 67 --
fs/xfs/xfs_inode.h | 76 --
fs/xfs/xfs_inode_item.c | 5
fs/xfs/xfs_iomap.c | 85 --
fs/xfs/xfs_iomap.h | 8
fs/xfs/xfs_log.c | 181 +----
fs/xfs/xfs_log_priv.h | 20
fs/xfs/xfs_log_recover.c | 1
fs/xfs/xfs_mount.c | 2
fs/xfs/xfs_quota.h | 8
fs/xfs/xfs_rename.c | 1
fs/xfs/xfs_rtalloc.c | 1
fs/xfs/xfs_rw.c | 3
fs/xfs/xfs_trans.h | 47 +
fs/xfs/xfs_trans_buf.c | 62 -
fs/xfs/xfs_vnodeops.c | 8
70 files changed, 2151 insertions(+), 2592 deletions(-)

Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder

Christoph Hellwig
2009-12-15 13:08:16 +0800

16 Mar, 2009

1 commit

6cc87645e xfs: factor out code to find the longest free extent in the AG ... Browse Code »

Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig

Dave Chinner
2009-03-16 15:29:46 +0800

01 Dec, 2008

1 commit

4805621a3 [XFS] factor out xfs_read_agf helper ... Browse Code »

Add a helper to read the AGF header and perform basic verification.
Based on hunks from a larger patch from Dave Chinner.

(First sent on Juli 23rd)

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Niv Sardi

From: Christoph Hellwig
2008-12-01 08:37:20 +0800

30 Oct, 2008

10 commits

7cc95a821 [XFS] Always use struct xfs_btree_block instead of short / longform ... Browse Code »

structures.

Always use the generic xfs_btree_block type instead of the short / long
structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for
the length of a short / long form block. The rationale for this is that we
will grow more btree block header variants to support CRCs and other RAS
information, and always accessing them through the same datatype with
unions for the short / long form pointers makes implementing this much
easier.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32300a

Signed-off-by: Christoph Hellwig
Signed-off-by: Donald Douwsma
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

Christoph Hellwig
2008-10-30 14:14:34 +0800
89b283931 [XFS] Check agf_btreeblks is valid when reading in the AGF ... Browse Code »

SGI-PV: 987683

SGI-Modid: xfs-linux-melb:xfs-kern:32232a

Signed-off-by: Barry Naujok
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

Barry Naujok
2008-10-30 14:05:49 +0800
8cc938fe4 [XFS] implement generic xfs_btree_get_rec ... Browse Code »

Not really much reason to make it generic given that it's so small, but
this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c,
so it makes the whole btree implementation more structured.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32206a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:58:11 +0800
91cca5df9 [XFS] implement generic xfs_btree_delete/delrec ... Browse Code »

Make the btree delete code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32205a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:58:01 +0800
4b22a5718 [XFS] implement generic xfs_btree_insert/insrec ... Browse Code »

Make the btree insert code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32202a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:57:40 +0800
278d0ca14 [XFS] implement generic xfs_btree_update ... Browse Code »

From: Dave Chinner

The most complicated part here is the lastrec tracking for the alloc
btree. Most logic is in the update_lastrec method which has to do some
hopefully good enough dirty magic to maintain it.

[hch: split out from bigger patch and a rework of the lastrec

logic]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32194a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:56:32 +0800
fe033cc84 [XFS] implement generic xfs_btree_lookup ... Browse Code »

From: Dave Chinner

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32192a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:56:09 +0800
8df4da4a0 [XFS] implement generic xfs_btree_decrement ... Browse Code »

From: Dave Chinner

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32191a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:55:58 +0800
637aa50f4 [XFS] implement generic xfs_btree_increment ... Browse Code »

From: Dave Chinner

Because this is the first major generic btree routine this patch includes
some infrastrucure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:55:45 +0800
561f7d173 [XFS] split up xfs_btree_init_cursor ... Browse Code »

xfs_btree_init_cursor contains close to little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a

Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner

Christoph Hellwig
2008-10-30 13:53:59 +0800

18 Apr, 2008

4 commits

e6430037e [XFS] fix logic error in xfs_alloc_ag_vextent_near() ... Browse Code »

Fix a logic error in xfs_alloc_ag_vextent_near(). This is a regression
introduced by the error handling changes.

SGI-PV: 890084
SGI-Modid: xfs-linux-melb:xfs-kern:30838a

Signed-off-by: David Chinner
Signed-off-by: Barry Naujok
Signed-off-by: Lachlan McIlroy

David Chinner
2008-04-18 10:03:02 +0800
12375c823 [XFS] Make xfs_alloc_compute_aligned() void. ... Browse Code »

xfs_alloc_compute_aligned() returns a value based on a comparison of the
computed extent length and the minimum length allowed. This is only used
by some callers - the other four return parameters are used more often.
Hence move the comparison to the code that actually needs to do it and
make xfs_alloc_compute_aligned() a void function.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30797a

Signed-off-by: David Chinner
Signed-off-by: Niv Sardi
Signed-off-by: Lachlan McIlroy

David Chinner
2008-04-18 09:58:36 +0800
f4586e406 [XFS] Clean up xfs_alloc_search_busy() return values. ... Browse Code »

xfs_alloc_search_busy() returns an index into the busy array if the extent
was found in the array. This is never checked, and the
xfs_alloc_search_busy() does a log force to prevent reuse of the extent
before the free transaction hits the disk. Hence the return value is
useless. Declare the function void and remove the slot number from the
tracing as well.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30796a

Signed-off-by: David Chinner
Signed-off-by: Niv Sardi
Signed-off-by: Lachlan McIlroy

David Chinner
2008-04-18 09:58:27 +0800
34a622b2e [XFS] replace remaining __FUNCTION__ occurrences ... Browse Code »

__FUNCTION__ is gcc-specific, use __func__

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30775a

Signed-off-by: Harvey Harrison
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

Harvey Harrison
2008-04-18 09:51:26 +0800

14 Feb, 2008

1 commit

413d57c99 xfs: convert beX_add to beX_add_cpu (new common API) ... Browse Code »

remove beX_add functions and replace all uses with beX_add_cpu

Signed-off-by: Marcin Slusarz
Cc: Mark Fasheh
Reviewed-by: Dave Chinner
Cc: Timothy Shimmin
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Marcin Slusarz
2008-02-14 08:21:19 +0800

07 Feb, 2008

2 commits

007c61c68 [XFS] Remove spin.h ... Browse Code »

remove spinlock init abstraction macro in spin.h, remove the callers, and
remove the file. Move no-op spinlock_destroy to xfs_linux.h Cleanup
spinlock locals in xfs_mount.c

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29751a

Signed-off-by: Eric Sandeen
Signed-off-by: Donald Douwsma
Signed-off-by: Lachlan McIlroy
Signed-off-by: Tim Shimmin

Eric Sandeen
2008-02-07 13:47:45 +0800
64137e56d [XFS] Unwrap pagb_lock. ... Browse Code »

Un-obfuscate pagb_lock, remove mutex_lock->spin_lock macros, call
spin_lock directly, remove extraneous cookie holdover from old xfs code,
and change lock type to spinlock_t.

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29743a

Signed-off-by: Eric Sandeen
Signed-off-by: Donald Douwsma
Signed-off-by: Tim Shimmin

Eric Sandeen
2008-02-07 13:46:39 +0800

14 Jul, 2007

2 commits

3a59c94c4 [XFS] Clean up function name handling in tracing code ... Browse Code »

Remove the hardcoded "fnames" for tracing, and just embed them in tracing
macros via __FUNCTION__. Kills a lot of #ifdefs too.

SGI-PV: 967353
SGI-Modid: xfs-linux-melb:xfs-kern:29099a

Signed-off-by: Eric Sandeen
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Eric Sandeen
2007-07-14 13:41:24 +0800
92821e2ba [XFS] Lazy Superblock Counters ... Browse Code »

When we have a couple of hundred transactions on the fly at once, they all
typically modify the on disk superblock in some way.
create/unclink/mkdir/rmdir modify inode counts, allocation/freeing modify
free block counts.

When these counts are modified in a transaction, they must eventually lock
the superblock buffer and apply the mods. The buffer then remains locked
until the transaction is committed into the incore log buffer. The result
of this is that with enough transactions on the fly the incore superblock
buffer becomes a bottleneck.

The result of contention on the incore superblock buffer is that
transaction rates fall - the more pressure that is put on the superblock
buffer, the slower things go.

The key to removing the contention is to not require the superblock fields
in question to be locked. We do that by not marking the superblock dirty
in the transaction. IOWs, we modify the incore superblock but do not
modify the cached superblock buffer. In short, we do not log superblock
modifications to critical fields in the superblock on every transaction.
In fact we only do it just before we write the superblock to disk every
sync period or just before unmount.

This creates an interesting problem - if we don't log or write out the
fields in every transaction, then how do the values get recovered after a
crash? the answer is simple - we keep enough duplicate, logged information
in other structures that we can reconstruct the correct count after log
recovery has been performed.

It is the AGF and AGI structures that contain the duplicate information;
after recovery, we walk every AGI and AGF and sum their individual
counters to get the correct value, and we do a transaction into the log to
correct them. An optimisation of this is that if we have a clean unmount
record, we know the value in the superblock is correct, so we can avoid
the summation walk under normal conditions and so mount/recovery times do
not change under normal operation.

One wrinkle that was discovered during development was that the blocks
used in the freespace btrees are never accounted for in the AGF counters.
This was once a valid optimisation to make; when the filesystem is full,
the free space btrees are empty and consume no space. Hence when it
matters, the "accounting" is correct. But that means the when we do the
AGF summations, we would not have a correct count and xfs_check would
complain. Hence a new counter was added to track the number of blocks used
by the free space btrees. This is an *on-disk format change*.

As a result of this, lazy superblock counters are a mkfs option and at the
moment on linux there is no way to convert an old filesystem. This is
possible - xfs_db can be used to twiddle the right bits and then
xfs_repair will do the format conversion for you. Similarly, you can
convert backwards as well. At some point we'll add functionality to
xfs_admin to do the bit twiddling easily....

SGI-PV: 964999
SGI-Modid: xfs-linux-melb:xfs-kern:28652a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-07-14 13:28:50 +0800

08 May, 2007

1 commit

e7a23a9b3 [XFS] reducing the number of random number functions. ... Browse Code »

Patch provided by Joe Perches

SGI-PV: 961696
SGI-Modid: xfs-linux-melb:xfs-kern:28209a

Signed-off-by: Joe Perches
Signed-off-by: Lachlan McIlroy
Signed-off-by: Tim Shimmin

Joe Perches
2007-05-08 11:49:03 +0800

28 Sep, 2006

2 commits

d432c80e6 [XFS] Minor code rearranging and cleanup to prevent some coverity false ... Browse Code »

positives.

SGI-PV: 955502
SGI-Modid: xfs-linux-melb:xfs-kern:26805a

Signed-off-by: Nathan Scott
Signed-off-by: Tim Shimmin

Nathan Scott
2006-09-28 09:03:44 +0800
e21010053 [XFS] endianess annotation for xfs_agfl_t. Trivial, xfs_agfl_t is always ... Browse Code »

used for ondisk values.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26553a

Signed-off-by: Christoph Hellwig
Signed-off-by: Nathan Scott
Signed-off-by: Tim Shimmin

Christoph Hellwig
2006-09-28 08:56:51 +0800

10 Aug, 2006

1 commit

0e1edbd99 [XFS] Fix xfs_free_extent related NULL pointer dereference. ... Browse Code »

We recently fixed an out-of-space deadlock in XFS, and part of that fix
involved the addition of the XFS_ALLOC_FLAG_FREEING flag to some of the
space allocator calls to indicate they're freeing space, not allocating
it. There was a missed xfs_alloc_fix_freelist condition test that did not
correctly test "flags". The same test would also test an uninitialised
structure field (args->userdata) and depending on its value either would
or would not return early with a critical buffer pointer set to NULL.

This fixes that up, adds asserts to several places to catch future botches
of this nature, and skips sections of xfs_alloc_fix_freelist that are
irrelevent for the space-freeing case.

SGI-PV: 955303
SGI-Modid: xfs-linux-melb:xfs-kern:26743a

Signed-off-by: Nathan Scott

Nathan Scott
2006-08-10 12:40:41 +0800

20 Jun, 2006

1 commit

f6c2d1fa6 [XFS] Remove version 1 directory code. Never functioned on Linux, just ... Browse Code »

pure bloat.

SGI-PV: 952969
SGI-Modid: xfs-linux-melb:xfs-kern:26251a

Signed-off-by: Nathan Scott

Nathan Scott
2006-06-20 11:04:51 +0800

09 Jun, 2006

1 commit

d210a28cd [XFS] In actual allocation of file system blocks and freeing extents, the ... Browse Code »

transaction within each such operation may involve multiple locking of AGF
buffer. While the freeing extent function has sorted the extents based on
AGF number before entering into transaction, however, when the file system
space is very limited, the allocation of space would try every AGF to get
space allocated, this could potentially cause out-of-order locking, thus
deadlock could happen. This fix mitigates the scarce space for allocation
by setting aside a few blocks without reservation, and avoid deadlock by
maintaining ascending order of AGF locking.

SGI-PV: 947395
SGI-Modid: xfs-linux-melb:xfs-kern:210801a

Signed-off-by: Yingping Lu
Signed-off-by: Nathan Scott

Yingping Lu
2006-06-09 12:55:18 +0800

08 May, 2006

1 commit

e63a36900 [XFS] Fix a possible metadata buffer (AGFL) refcount leak when fixing an ... Browse Code »

AG freelist.

SGI-PV: 952681
SGI-Modid: xfs-linux-melb:xfs-kern:25902a

Signed-off-by: Nathan Scott

Nathan Scott
2006-05-08 17:51:58 +0800

29 Mar, 2006

1 commit

c41564b5a [XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all ... Browse Code »

these typos.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a

Signed-off-by: Nathan Scott

Nathan Scott
2006-03-29 06:55:14 +0800