14 Aug, 2013

1 commit

  • The xc_cil_lock is used for two purposes - to protect the CIL
    itself, and to protect the push/commit state and lists. These are
    two logically separate structures and operations, so they can have
    their own locks. This means that pushing on the CIL and the commit
    wait ordering won't contend for a lock with other transactions that
    are completing concurrently. As CIL insertion is the hottest path
    through the CIL, this is a big win.
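
    For illustration only, a minimal sketch of the split - field and
    function names are assumed here, not taken from the actual kernel
    code:

    #include <linux/spinlock.h>
    #include <linux/list.h>

    struct cil_sketch {
            spinlock_t       xc_cil_lock;   /* protects the CIL itself */
            struct list_head xc_cil;        /* logged items in the CIL */
            spinlock_t       xc_push_lock;  /* protects push/commit state */
            struct list_head xc_committing; /* contexts being committed */
    };

    /*
     * Hot path: insertion takes only the CIL lock, so it no longer
     * contends with push scheduling and commit wait ordering, which
     * serialise on the push lock instead.
     */
    static void cil_insert_sketch(struct cil_sketch *cil, struct list_head *li)
    {
            spin_lock(&cil->xc_cil_lock);
            list_add_tail(li, &cil->xc_cil);
            spin_unlock(&cil->xc_cil_lock);
    }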

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

13 Aug, 2013

1 commit

  • The on-disk format definitions for the log are spread randomly
    through a couple of header files. Consolidate them all in a single
    file that can be shared easily with userspace. This means that
    xfs_log.h and xfs_log_priv.h no longer need to be shared with
    userspace.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

17 Apr, 2013

1 commit


04 Dec, 2012

1 commit

  • Not a bug as such, just warning noise from the xlog_cksum()
    returning a __be32 type when it should be returning a __le32 type.

    On Wed, Nov 28, 2012 at 08:30:59AM -0500, Christoph Hellwig wrote:
    > But why are we storing the crc field little endian while all other on
    > disk formats are big endian? (And yes I realize it might as well have
    > been me who did that back in the day, but I still have no idea why)

    Because the CRC calculation always returns the result in LE format,
    even on BE systems. So rather than always having to byte swap it
    everywhere and have all the forced casts and annotations for sparse,
    it seems simpler to just make it a __le32 everywhere....
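
    For illustration only, a sketch of the resulting pattern (the helper
    name is assumed):

    #include <linux/types.h>
    #include <linux/crc32c.h>
    #include <asm/byteorder.h>

    /*
     * crc32c() yields a u32 in CPU order; typing the stored value as
     * __le32 means one annotated cpu_to_le32() at the producer instead
     * of byte swaps and forced casts at every consumer.
     */
    static __le32 log_cksum_sketch(const void *buf, unsigned int len)
    {
            u32 crc = crc32c(~0U, buf, len);

            return cpu_to_le32(crc);
    }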

    Signed-off-by: Dave Chinner
    Reviewed-by: Ben Myers
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

20 Nov, 2012

1 commit

  • Implement CRCs for the log buffers. We re-use a field in
    struct xlog_rec_header that was previously used for a weak checksum
    of the log buffer payload in debug builds.

    The new checksumming uses the crc32c checksum we will use elsewhere
    in XFS, and also protects the record header and additional cycle data.

    Due to this there are some interesting changes in xlog_sync, as we
    need to do the cycle wrapping for the split buffer case much earlier,
    as otherwise we would touch the buffer after generating the checksum.

    The CRC calculation is always enabled, even for non-CRC filesystems,
    as adding this CRC does not change the log format. On non-CRC
    filesystems, only issue an alert if a CRC mismatch is found and
    allow recovery to continue - this will act as an indicator that
    log recovery problems are a result of log corruption. On CRC enabled
    filesystems, however, log recovery will fail.

    Note that existing debug kernels will write a simple checksum value
    to the log, so the first time this is run on a filesystem that was
    last used on a debug kernel it will throw CRC mismatch warnings.
    These can be ignored.
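
    For illustration only, a sketch of the recovery-time policy described
    above - it assumes XFS internal headers and is not the actual
    recovery code:

    /* Warn and continue on non-CRC filesystems, fail recovery on
     * CRC-enabled ones. */
    static int log_verify_cksum_sketch(struct xfs_mount *mp,
                                       __le32 ondisk, __le32 computed)
    {
            if (ondisk == computed)
                    return 0;
            if (!xfs_sb_version_hascrc(&mp->m_sb)) {
                    xfs_alert(mp, "log record CRC mismatch (corrupt log?)");
                    return 0;       /* alert only, keep recovering */
            }
            return EFSCORRUPTED;    /* fail recovery on CRC filesystems */
    }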

    Initially based on a patch from Dave Chinner, then modified
    significantly by Christoph Hellwig. Modified again by Dave Chinner
    to get to this version.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

18 Oct, 2012

1 commit

  • The only thing the periodic sync work does now is flush the AIL and
    idle the log. These are really functions of the log code, so move
    the work to xfs_log.c and rename it appropriately.

    The only wart that this leaves behind is the xfssyncd_centisecs
    sysctl, otherwise the xfssyncd is dead. Clean up any comments that
    related to xfssyncd to reflect its passing.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Dave Chinner
     

22 Jun, 2012

2 commits


30 May, 2012

1 commit


15 May, 2012

1 commit

  • Doing background CIL flushes adds significant latency to whatever
    async transaction triggers it. To avoid blocking async
    transactions on things like waiting for log buffer IO to complete,
    move the CIL push off into a workqueue. By moving the push work
    into a workqueue, we remove all the latency that the commit adds
    from the foreground transaction commit path. This also means that
    single threaded workloads won't do the CIL push processing, leaving
    them more CPU to do more async transactions.

    To do this, we need to keep track of the sequence number we have
    pushed work for. This avoids having many transaction commits
    attempting to schedule work for the same sequence, and ensures that
    we only ever have one push (background or forced) in progress at a
    time. It also means that we don't need to take the CIL lock in write
    mode to check for potential background push races, which reduces
    lock contention.
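
    For illustration only, a sketch of that sequence tracking - field
    names are assumed and the real locking differs in detail:

    #include <linux/spinlock.h>
    #include <linux/types.h>
    #include <linux/workqueue.h>

    struct cil_push_sketch {
            spinlock_t              xc_lock;
            u64                     xc_current_sequence;
            u64                     xc_push_seq;    /* highest seq queued */
            struct work_struct      xc_push_work;
            struct workqueue_struct *xc_push_wq;
    };

    static void cil_push_background_sketch(struct cil_push_sketch *cil)
    {
            /* unlocked check: a push for this sequence is already queued */
            if (cil->xc_push_seq == cil->xc_current_sequence)
                    return;

            spin_lock(&cil->xc_lock);
            if (cil->xc_push_seq < cil->xc_current_sequence) {
                    cil->xc_push_seq = cil->xc_current_sequence;
                    queue_work(cil->xc_push_wq, &cil->xc_push_work);
            }
            spin_unlock(&cil->xc_lock);
    }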

    To avoid potential issues with "smart" IO schedulers, don't use the
    workqueue for log force triggered flushes. Instead, do them directly
    so that the log IO is issued by the process doing the log force and
    doesn't get stuck behind IO elevator queue idling, which would
    incorrectly delay the log IO issued from the workqueue.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

23 Feb, 2012

4 commits

  • Split the log regrant case out of xfs_log_reserve into a separate function,
    and merge xlog_grant_log_space and xlog_regrant_write_log_space into their
    respective callers. Also replace the XFS_LOG_PERM_RESERV flag, which easily
    got misused before the previous cleanups, with a simple boolean parameter.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Add a new data structure to allow sharing code between the log grant and
    regrant code.

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • The tic->t_wait waitqueues can never have more than a single waiter
    on them, so we can easily replace them with a task_struct pointer
    and wake_up_process.
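
    For illustration only, the resulting pattern with assumed names:

    #include <linux/sched.h>

    struct tic_sketch {
            struct task_struct *t_task;     /* the single sleeping waiter */
    };

    static void tic_wait_sketch(struct tic_sketch *tic)
    {
            tic->t_task = current;
            __set_current_state(TASK_UNINTERRUPTIBLE);
            schedule();                     /* woken by tic_wake_sketch() */
    }

    static void tic_wake_sketch(struct tic_sketch *tic)
    {
            wake_up_process(tic->t_task);   /* at most one waiter exists */
    }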

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Currently xfs_log_move_tail has a tail_lsn argument that is horribly
    overloaded: it may contain either an actual lsn to assign to the log tail,
    0 as a special case to use the last sync LSN, or 1 to indicate that no tail
    LSN assignment should be performed, and we should opportunistically
    wake up one task waiting for log space even if we did not move the
    LSN.

    Remove the tail LSN assignment from xfs_log_move_tail and make the two
    callers use xlog_assign_tail_lsn instead of the current variant of
    partially using the code in xfs_log_move_tail and partially open-coding
    it. Note that this means we grow an additional lock roundtrip on the
    AIL lock for each bulk update
    or delete, which is still far less than what we had before introducing the
    bulk operations. If this proves to be a problem we can still add a variant
    of xlog_assign_tail_lsn that expects the lock to be held already.

    Also rename the remainder of xfs_log_move_tail to xfs_log_space_wake as
    that name describes its functionality much better.

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

29 Apr, 2011

1 commit

  • Update the extent tree in case we have to reuse a busy extent, so that it
    is always kept up to date. This is done by replacing the busy list searches
    with a new xfs_alloc_busy_reuse helper, which updates the busy extent tree
    in case of a reuse. This allows us to reuse metadata extents
    unconditionally, and thus avoid log forces, especially for allocation btree
    blocks.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

12 Apr, 2011

1 commit

  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: use proper interfaces for on-stack plugging
    xfs: fix xfs_debug warnings
    xfs: fix variable set but not used warnings
    xfs: convert log tail checking to a warning
    xfs: catch bad block numbers freeing extents.
    xfs: push the AIL from memory reclaim and periodic sync
    xfs: clean up code layout in xfs_trans_ail.c
    xfs: convert the xfsaild threads to a workqueue
    xfs: introduce background inode reclaim work
    xfs: convert ENOSPC inode flushing to use new syncd workqueue
    xfs: introduce a xfssyncd workqueue
    xfs: fix extent format buffer allocation size
    xfs: fix unreferenced var error in xfs_buf.c

    Also, applied patch from Tony Luck that fixes ia64:
    xfs_destroy_workqueues() should not be tagged with __exit
    in the branch before merging.

    Linus Torvalds
     

08 Apr, 2011

1 commit

  • On the Power platform, the log tail debug checks fire excessively
    causing the system to panic early in testing. The debug checks are
    known to be racy, though on x86_64 there is no evidence that they
    trigger at all.

    We want to keep the checks active on debug systems to alert us to
    problems with log space accounting, but we need to reduce the impact
    of a racy check on testing on the Power platform.

    As a result, convert the ASSERT conditions to warnings, and
    allow them to fire only once per filesystem mount. This will prevent
    false positives from interfering with testing, whilst still
    providing us with the indication that there may be a problem with
    log space accounting should one occur.
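
    For illustration only, the warn-once pattern this describes - the
    flag name is assumed, and xfs_warn() comes from XFS internal
    headers:

    #define LOG_TAIL_WARNED_SKETCH  (1 << 0)  /* "already warned" bit */

    struct xfs_mount;

    struct log_sketch {
            unsigned int     l_flags;
            struct xfs_mount *l_mp;
    };

    static void log_tail_warn_sketch(struct log_sketch *log)
    {
            if (log->l_flags & LOG_TAIL_WARNED_SKETCH)
                    return;                 /* fire only once per mount */
            log->l_flags |= LOG_TAIL_WARNED_SKETCH;
            xfs_warn(log->l_mp,
                     "log tail check failed - possible space accounting problem");
    }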

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Alex Elder

    Dave Chinner
     

31 Mar, 2011

1 commit


07 Mar, 2011

1 commit

  • Convert the xfs log operations to use the new error logging
    interfaces. This removes the xlog_{warn,panic} wrappers and makes
    almost all errors emit the device they belong to instead of just
    referring to "XFS".

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

21 Dec, 2010

6 commits

  • The only thing that the grant lock remains to protect is the grant head
    manipulations when adding or removing space from the log. These calculations
    are already based on atomic variables, so we can already update them safely
    without locks. However, the grant head manipulations require atomic multi-step
    calculations to be executed, which the algorithms currently don't allow.

    To make these multi-step calculations atomic, convert the algorithms to
    compare-and-exchange loops on the atomic variables. That is, we sample the old
    value, perform the calculation and use atomic64_cmpxchg() to attempt to update
    the head with the new value. If the head has not changed since we sampled it,
    it will succeed and we are done. Otherwise, we rerun the calculation again from
    a new sample of the head.
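
    For illustration only, the shape of that loop - the calculation
    helper is assumed:

    #include <linux/atomic.h>
    #include <linux/types.h>

    s64 grant_head_calc_sketch(s64 head, int bytes);  /* assumed helper */

    static void grant_head_update_sketch(atomic64_t *head, int bytes)
    {
            s64 old, new;

            do {
                    old = atomic64_read(head);
                    new = grant_head_calc_sketch(old, bytes);
            } while (atomic64_cmpxchg(head, old, new) != old);
            /* on interference, resample and redo the calculation */
    }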

    This allows us to remove the grant lock from around all the grant head
    space manipulations, which effectively removes the grant lock from the
    log completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • The log grant ticket wait queues are currently protected by the log
    grant lock. However, the queues are functionally independent from
    each other, and operations on them only require serialisation
    against other queue operations now that all of the other log
    variables they use are atomic values.

    Hence, we can make them independent of the grant lock by introducing
    new locks just to protect the list operations. Because the lists
    are independent, we can use a lock per list and ensure that reserve
    and write head queuing do not contend.

    To ensure forced shutdowns work correctly in conjunction with the
    new fast paths, ensure that we check whether the log has been shut
    down in the grant functions once we hold the relevant spin locks but
    before we go to sleep. This is needed to co-ordinate correctly with
    the wakeups that are issued on the ticket queues so we don't leave
    any processes sleeping on the queues during a shutdown.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • log->l_tail_lsn is currently protected by the log grant lock. The
    lock is only needed for serialising readers against writers, so we
    don't really need the lock if we make the l_tail_lsn variable an
    atomic. Converting the l_tail_lsn variable to an atomic64_t means we
    can start to peel back the grant lock from various operations.

    Also, provide functions to safely crack an atomic LSN variable into
    its component pieces and to recombine the components into an
    atomic variable. Use them where appropriate.

    This also removes the need for explicitly holding a spinlock to read
    the l_tail_lsn on 32 bit platforms.
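
    For illustration only, a sketch of such helpers:

    #include <linux/atomic.h>
    #include <linux/types.h>

    /*
     * An LSN packs a 32 bit cycle and a 32 bit block into 64 bits, so
     * it fits one atomic64_t and 32 bit platforms can read it without
     * a spinlock guarding against torn values.
     */
    static inline void crack_atomic_lsn_sketch(atomic64_t *lsn,
                                               u32 *cycle, u32 *block)
    {
            s64 val = atomic64_read(lsn);

            *cycle = val >> 32;
            *block = (u32)val;
    }

    static inline s64 combine_lsn_sketch(u32 cycle, u32 block)
    {
            return ((s64)cycle << 32) | block;
    }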

    Signed-off-by: Dave Chinner

    Dave Chinner
     
  • The log grant queues are one of the few places left using sv_t
    constructs for waiting. Given we are touching this code, we should
    convert them to plain wait queues. While there, convert all the
    other sv_t users in the log code as well.

    Seeing as this removes the last users of the sv_t type, remove the
    header file defining the wrapper and the fragments that still
    reference it.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • Prepare for switching the grant heads to atomic variables by
    combining the two 32 bit values that make up the grant head into a
    single 64 bit variable. Provide wrapper functions to combine and
    split the grant heads appropriately for calculations and use them as
    necessary.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • The grant write and reserve queues use a roll-your-own doubly linked
    list, so convert them to a standard list_head structure and convert
    all the list traversals to use list_for_each_entry(). We can also
    get rid of the XLOG_TIC_IN_Q flag as we can use the list_empty()
    check to tell if the ticket is in a list or not.
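
    For illustration only, the resulting pattern (names assumed):

    #include <linux/list.h>
    #include <linux/types.h>

    struct qticket_sketch {
            struct list_head t_queue;       /* was roll-your-own prev/next */
    };

    /* replaces the XLOG_TIC_IN_Q flag */
    static bool ticket_queued_sketch(struct qticket_sketch *tic)
    {
            return !list_empty(&tic->t_queue);
    }

    static void ticket_dequeue_sketch(struct qticket_sketch *tic)
    {
            /* list_del_init() keeps list_empty() accurate afterwards */
            list_del_init(&tic->t_queue);
    }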

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

17 Dec, 2010

1 commit


03 Dec, 2010

2 commits

  • Convert the log grant heads to atomic64_t types in preparation for
    converting the accounting algorithms to atomic operations. This patch
    just converts the variables; the algorithmic changes are in a
    separate patch for clarity.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • log->l_last_sync_lsn is updated in only one critical spot - log
    buffer IO completion - and is protected by the grant lock there. This
    requires the grant lock to be taken for every log buffer IO
    completion. Converting the l_last_sync_lsn variable to an atomic64_t
    means that we do not need to take the grant lock in log buffer IO
    completion to update it.

    This also removes the need for explicitly holding a spinlock to read
    the l_last_sync_lsn on 32 bit platforms.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

29 Sep, 2010

1 commit

  • I have been seeing occasional pauses in transaction throughput up to
    30s long under heavy parallel workloads. The only notable thing was
    that the xfsaild was trying to be active during the pauses, but
    making no progress. It was running exactly 20 times a second (on the
    50ms no-progress backoff), and the number of pushbuf events was
    constant across this time as well. IOWs, the xfsaild appeared to be
    stuck on buffers that it could not push out.

    Further investigation indicated that it was trying to push out inode
    buffers that were pinned and/or locked. The xfsbufd was also getting
    woken at the same frequency (by the xfsaild, no doubt) to push out
    delayed write buffers. The xfsbufd was not making any progress
    because all the buffers in the delwri queue were pinned. This scan-
    and-make-no-progress dance went on in the trace for some seconds,
    before the xfssyncd came along and issued a log force, and then
    things started going again.

    However, I noticed something strange about the log force - there
    were way too many IOs issued. 516 log buffers were written, to be
    exact. That added up to 129MB of log IO, which got me very
    interested because it's almost exactly 25% of the size of the log.
    The delayed logging code is supposed to aggregate the minimum of 25%
    of the log or 8MB worth of changes before flushing. That's what
    really puzzled me - why did a log force write 129MB instead of only
    8MB?

    Essentially what has happened is that no CIL pushes had occurred
    since the previous tail push which cleared out 25% of the log space.
    That caused all the new transactions to block because there wasn't
    log space for them, but they kicked the xfsaild to push the tail.
    However, the xfsaild was not making progress because there were
    buffers it could not lock and flush, and the xfsbufd could not flush
    them because they were pinned. As a result, both the xfsaild and the
    xfsbufd could not move the tail of the log forward without the CIL
    first committing.

    The cause of the problem was that the background CIL push, which
    should happen when 8MB of aggregated changes have been committed, is
    being held off by the concurrent transaction commit load. The
    background push does a down_write_trylock() which will fail if there
    is a concurrent transaction commit holding the push lock in read
    mode. With 8 CPUs all doing transactions as fast as they can, there
    were enough concurrent transaction commits to hold off the background
    push until tail-pushing could no longer free log space, and the halt
    would occur.

    It should be noted that there is no reason why it would halt at 25%
    of log space used by a single CIL checkpoint. This bug could
    definitely violate the "no transaction should be larger than half
    the log" requirement and hence result in corruption if the system
    crashed under heavy load. This sort of bug is exactly the reason why
    delayed logging was tagged as experimental....

    The fix is to start blocking background pushes once the threshold
    has been exceeded. Rework the threshold calculations to keep the
    amount of log space a CIL checkpoint can use to below that of the
    AIL push threshold to avoid the problem completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

24 Aug, 2010

1 commit

  • Delayed logging adds some serialisation to the log force process to
    ensure that it does not dereference a bad commit context structure
    when determining if a CIL push is necessary or not. It does this by
    grabbing the CIL context lock exclusively, then dropping it before
    pushing the CIL if necessary. This causes serialisation of all log
    forces and pushes regardless of whether a force is necessary or not.
    As a result fsync heavy workloads (like dbench) can be significantly
    slower with delayed logging than without.

    To avoid this penalty, copy the current sequence from the context to
    the CIL structure when they are swapped. This allows us to do
    unlocked checks on the current sequence without having to worry
    about dereferencing context structures that may have already been
    freed. Hence we can remove the CIL context locking in the forcing
    code and only call into the push code if the current context matches
    the sequence we need to force.

    By passing the sequence into the push code, we can check the
    sequence again once we have the CIL lock held exclusive and abort if
    the sequence has already been pushed. This avoids a lock round-trip
    and unnecessary CIL pushes when we have racing push calls.
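
    For illustration only, a sketch of that re-check - the bookkeeping
    field is assumed:

    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct cil_seq_sketch {
            spinlock_t xc_lock;
            u64        xc_current_sequence; /* copied from ctx at swap */
            u64        xc_pushed_seq;       /* assumed bookkeeping */
    };

    static void cil_push_seq_sketch(struct cil_seq_sketch *cil, u64 seq)
    {
            spin_lock(&cil->xc_lock);
            if (seq <= cil->xc_pushed_seq) {
                    spin_unlock(&cil->xc_lock);
                    return;                 /* raced: already pushed */
            }
            cil->xc_pushed_seq = seq;
            spin_unlock(&cil->xc_lock);
            /* ... push the checkpoint for 'seq' ... */
    }

    static void cil_force_seq_sketch(struct cil_seq_sketch *cil, u64 seq)
    {
            /*
             * Unlocked check is safe: the sequence lives in the CIL
             * itself, not in a context that may already have been freed.
             */
            if (cil->xc_current_sequence == seq)
                    cil_push_seq_sketch(cil, seq);
    }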

    The result is that the regression in dbench performance goes away -
    this change improves dbench performance on a ramdisk from ~2100MB/s
    to ~2500MB/s. This compares favourably to not using delayed logging
    which returns ~2500MB/s for the same workload.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

24 May, 2010

3 commits

  • If we let the CIL grow without bound, it will grow large enough to violate
    recovery constraints (there must be at least one complete transaction in the
    all times) or take forever to write out through the log buffers. Hence we need
    a check during asynchronous transactions as to whether the CIL needs to be
    pushed.

    We track the amount of log space the CIL consumes, so it is relatively simple
    to limit it on a pure size basis. Make the limit the minimum of just under half
    the log size (recovery constraint) or 8MB of log space (which is an awful lot
    of metadata).
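
    For illustration only, the threshold in plain C (the real code keeps
    the limit just under half the log):

    static long cil_space_limit_sketch(long log_size)
    {
            long half = log_size / 2;       /* recovery constraint */
            long cap  = 8 * 1024 * 1024;    /* 8MB of metadata changes */

            return half < cap ? half : cap;
    }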

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • The delayed logging code only changes in-memory structures and as
    such can be enabled and disabled with a mount option. Add the mount
    option and emit a warning that this is an experimental feature that
    should not be used in production yet.

    We also need infrastructure to track committed items that have not
    yet been written to the log. This is what the Committed Item List
    (CIL) is for.

    The log item also needs to be extended to track the current log
    vector, the associated memory buffer and its location in the
    Committed Item List. Extend the log item and log vector structures to enable
    this tracking.

    To maintain the current log format for transactions with delayed
    logging, we need to introduce a checkpoint transaction and a context
    for tracking each checkpoint from initiation to transaction
    completion. This includes adding a log ticket for tracking the log
    space required/used by the context checkpoint.

    To track all the changes we need an io vector array per log item,
    rather than a single array for the entire transaction. Using the new
    log vector structure for this requires two passes - the first to
    allocate the log vector structures and chain them together, and the
    second to fill them out. This log vector chain can then be passed
    to the CIL for formatting, pinning and insertion into the CIL.

    Formatting of the log vector chain is relatively simple - it's just
    a loop over the iovecs on each log vector, but it is made slightly
    more complex because we re-write the iovec after the copy to point
    back at the memory buffer we just copied into.
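
    For illustration only, the shape of that loop (names assumed):

    #include <linux/string.h>

    struct iovec_sketch {
            void *i_addr;
            int   i_len;
    };

    static char *format_vecs_sketch(struct iovec_sketch *vecs, int nvecs,
                                    char *buf)
    {
            int i;

            for (i = 0; i < nvecs; i++) {
                    memcpy(buf, vecs[i].i_addr, vecs[i].i_len);
                    vecs[i].i_addr = buf;   /* re-point at the copy */
                    buf += vecs[i].i_len;
            }
            return buf;                     /* next free byte */
    }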

    This code also needs to pin log items. If the log item is not
    already tracked in this checkpoint context, then it needs to be
    pinned. Otherwise it is already pinned and we don't need to pin it
    again.

    The only other complexity is calculating the amount of new log space
    the formatting has consumed. This needs to be accounted to the
    transaction in progress, and the accounting is made more complex
    because we also need to steal space from it for log metadata in the
    checkpoint transaction. Calculate all this at insert time and update
    all the tickets, counters, etc correctly.

    Once we've formatted all the log items in the transaction, attach
    the busy extents to the checkpoint context so the busy extents live
    until checkpoint completion and can be processed at that point in
    time. Transactions can then be freed.

    Now we need to issue checkpoints - we are tracking the amount of log space
    used by the items in the CIL, so we can trigger background checkpoints when the
    space usage gets to a certain threshold. Otherwise, checkpoints need to be
    triggered when a log synchronisation point is reached - a log force event.

    Because the log write code already handles chained log vectors, writing the
    transaction is trivial, too. Construct a transaction header, add it
    to the head of the chain and write it into the log, then issue a
    commit record write. Then we can release the checkpoint log ticket
    and attach the context to the log buffer so it can be called during
    IO completion to complete the checkpoint.

    We also need to allow for synchronising multiple in-flight
    checkpoints. This is needed for two things - the first is to ensure
    that checkpoint commit records appear in the log in the correct
    sequence order (so they are replayed in the correct order). The
    second is so that xfs_log_force_lsn() operates correctly and only
    flushes and/or waits for the specific sequence it was provided with.

    To do this we need a wait variable and a list tracking the
    checkpoint commits in progress. We can walk this list and wait for
    the checkpoints to change state or complete easily, and this provides
    the necessary synchronisation for correct operation in both cases.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • The ticket ID is needed to uniquely identify transactions when doing busy
    extent matching. Delayed logging changes the lifecycle of busy extents with
    respect to the transaction structure lifecycle. Hence we can no longer use
    the transaction structure as a means of determining the owner of the busy
    extent as it may be freed and reused while the busy extent is still active.

    This commit provides the infrastructure to access the xlog_tid_t held in the
    ticket from a transaction handle. This avoids the need for callers to peek
    into the transaction and log structures to find this out.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     

19 May, 2010

3 commits

  • There remains only one user of the l_sectbb_mask field in the log
    structure. Just kill it off and compute the mask where needed from
    the power-of-2 sector size.

    (The only update from the last post is to accommodate the changes in the
    previous patch in the series.)
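
    For illustration only, the mask computed on the fly:

    /*
     * The sector size in basic blocks is a power of two, so the
     * round-down mask is just (size - 1); no cached l_sectbb_mask
     * field is needed.
     */
    static inline unsigned int sector_offset_sketch(unsigned int bbno,
                                                    unsigned int sectBBsize)
    {
            return bbno & (sectBBsize - 1);
    }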

    Signed-off-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Alex Elder
     
  • Change struct log so it keeps track of the size (in basic blocks) of
    a log sector in l_sectBBsize rather than the log-base-2 of that
    value (previously, l_sectbb_log). The name was chosen for
    consistency with the other fields in the structure that represent
    a number of basic blocks.

    (Updated so that a variable used in computing and verifying a log's
    sector size is named "log2_size". Also added the "BB" to the
    structure field name, based on feedback from Eric Sandeen. Also
    dropped some superfluous parentheses.)

    Signed-off-by: Alex Elder
    Reviewed-by: Eric Sandeen

    Alex Elder
     
  • Replace the awkward xlog_write_adv_cnt with an inline helper that makes
    it more obvious that it is modifying its parameters, and replace the use
    of an integer type for "ptr" with a real void pointer. Also move
    xlog_write_adv_cnt to xfs_log_priv.h as it will be used outside of
    xfs_log.c in the delayed logging series.
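
    For illustration only, roughly the shape of such a helper:

    #include <linux/types.h>

    /*
     * Advancing all three values in one place makes the in-place
     * modification of the parameters explicit at the call site.
     */
    static inline void write_adv_cnt_sketch(void **ptr, int *len, int *off,
                                            size_t nbytes)
    {
            *ptr += nbytes;         /* void arithmetic: a GCC extension */
            *len -= nbytes;
            *off += nbytes;
    }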

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

16 Jan, 2010

1 commit


15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. All of this is described in more
    detail in Documentation/trace/events.txt. To read the events, do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads hit various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not yet in mainline; I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

01 Sep, 2009

1 commit


16 Mar, 2009

1 commit

  • Most callers of xlog_bread need to call xlog_align to get the actual offset.
    Consolidate that call into the main xlog_bread and provide a _xlog_bread
    for the few callers that don't want the actual offset.
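
    For illustration only, the consolidated shape (names assumed):

    struct log_sk;                  /* opaque for the sketch */

    int  log_bread_noalign_sketch(struct log_sk *log, int blkno,
                                  int nbblks, char *bp);
    char *log_align_sketch(struct log_sk *log, int blkno, char *bp);

    /*
     * Common case: read and hand back the aligned data pointer, so
     * callers no longer pair every read with an align call.
     */
    static int log_bread_sketch(struct log_sk *log, int blkno, int nbblks,
                                char *bp, char **offset)
    {
            int error = log_bread_noalign_sketch(log, blkno, nbblks, bp);

            if (!error)
                    *offset = log_align_sketch(log, blkno, bp);
            return error;
    }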

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig