14 Aug, 2013

1 commit

  • The xc_cil_lock is used for two purposes - to protect the CIL
    itself, and to protect the push/commit state and lists. These are
    two logically separate structures and operations, so they can have
    their own locks. This means that pushing on the CIL and the commit
    wait ordering won't contend for a lock with other transactions that
    are completing concurrently. As CIL insertion is the hottest path
    through the CIL, this is a big win.
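
    For illustration only, a minimal sketch of the split - field and
    function names are assumed here, not taken from the actual kernel
    code:

    #include <linux/spinlock.h>
    #include <linux/list.h>

    struct cil_sketch {
            spinlock_t       xc_cil_lock;   /* protects the CIL itself */
            struct list_head xc_cil;        /* logged items in the CIL */
            spinlock_t       xc_push_lock;  /* protects push/commit state */
            struct list_head xc_committing; /* contexts being committed */
    };

    /*
     * Hot path: insertion takes only the CIL lock, so it no longer
     * contends with push scheduling and commit wait ordering, which
     * serialise on the push lock instead.
     */
    static void cil_insert_sketch(struct cil_sketch *cil, struct list_head *li)
    {
            spin_lock(&cil->xc_cil_lock);
            list_add_tail(li, &cil->xc_cil);
            spin_unlock(&cil->xc_cil_lock);
    }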

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

13 Aug, 2013

1 commit

  • The on-disk format definitions for the log are spread randomly
    through a couple of header files. Consolidate them all in a single
    file that can be shared easily with userspace. This means that
    xfs_log.h and xfs_log_priv.h no longer need to be shared with
    userspace.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

17 Apr, 2013

1 commit


04 Dec, 2012

1 commit

  • Not a bug as such, just warning noise from the xlog_cksum()
    returning a __be32 type when it should be returning a __le32 type.

    On Wed, Nov 28, 2012 at 08:30:59AM -0500, Christoph Hellwig wrote:
    > But why are we storing the crc field little endian while all other on
    > disk formats are big endian? (And yes I realize it might as well have
    > been me who did that back in the day, but I still have no idea why)

    Because the CRC calculation always returns the result in LE format,
    even on BE systems. So rather than always having to byte swap it
    everywhere and have all the forced casts and annotations for sparse,
    it seems simpler to just make it a __le32 everywhere....
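
    For illustration only, a sketch of the resulting pattern (the helper
    name is assumed):

    #include <linux/types.h>
    #include <linux/crc32c.h>
    #include <asm/byteorder.h>

    /*
     * crc32c() yields a u32 in CPU order; typing the stored value as
     * __le32 means one annotated cpu_to_le32() at the producer instead
     * of byte swaps and forced casts at every consumer.
     */
    static __le32 log_cksum_sketch(const void *buf, unsigned int len)
    {
            u32 crc = crc32c(~0U, buf, len);

            return cpu_to_le32(crc);
    }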

    Signed-off-by: Dave Chinner
    Reviewed-by: Ben Myers
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

20 Nov, 2012

1 commit

  • Implement CRCs for the log buffers. We re-use a field in
    struct xlog_rec_header that was previously used for a weak checksum
    of the log buffer payload in debug builds.

    The new checksumming uses the crc32c checksum we will use elsewhere
    in XFS, and also protects the record header and additional cycle data.

    Due to this there are some interesting changes in xlog_sync, as we
    need to do the cycle wrapping for the split buffer case much earlier,
    as otherwise we would touch the buffer after generating the checksum.

    The CRC calculation is always enabled, even for non-CRC filesystems,
    as adding this CRC does not change the log format. On non-CRC
    filesystems, only issue an alert if a CRC mismatch is found and
    allow recovery to continue - this will act as an indicator that
    log recovery problems are a result of log corruption. On CRC enabled
    filesystems, however, log recovery will fail.

    Note that existing debug kernels will write a simple checksum value
    to the log, so the first time this is run on a filesystem that was
    last used on a debug kernel it will throw CRC mismatch warnings.
    These can be ignored.
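
    For illustration only, a sketch of the recovery-time policy described
    above - it assumes XFS internal headers and is not the actual
    recovery code:

    /* Warn and continue on non-CRC filesystems, fail recovery on
     * CRC-enabled ones. */
    static int log_verify_cksum_sketch(struct xfs_mount *mp,
                                       __le32 ondisk, __le32 computed)
    {
            if (ondisk == computed)
                    return 0;
            if (!xfs_sb_version_hascrc(&mp->m_sb)) {
                    xfs_alert(mp, "log record CRC mismatch (corrupt log?)");
                    return 0;       /* alert only, keep recovering */
            }
            return EFSCORRUPTED;    /* fail recovery on CRC filesystems */
    }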

    Initially based on a patch from Dave Chinner, then modified
    significantly by Christoph Hellwig. Modified again by Dave Chinner
    to get to this version.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

18 Oct, 2012

1 commit

  • The only thing the periodic sync work does now is flush the AIL and
    idle the log. These are really functions of the log code, so move
    the work to xfs_log.c and rename it appropriately.

    The only wart that this leaves behind is the xfssyncd_centisecs
    sysctl, otherwise the xfssyncd is dead. Clean up any comments that
    related to xfssyncd to reflect its passing.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Dave Chinner
     

22 Jun, 2012

2 commits


30 May, 2012

1 commit


15 May, 2012

1 commit

  • Doing background CIL flushes adds significant latency to whatever
    async transaction triggers it. To avoid blocking async
    transactions on things like waiting for log buffer IO to complete,
    move the CIL push off into a workqueue. By moving the push work
    into a workqueue, we remove all the latency that the commit adds
    from the foreground transaction commit path. This also means that
    single threaded workloads won't do the CIL push processing, leaving
    them more CPU to do more async transactions.

    To do this, we need to keep track of the sequence number we have
    pushed work for. This avoids having many transaction commits
    attempting to schedule work for the same sequence, and ensures that
    we only ever have one push (background or forced) in progress at a
    time. It also means that we don't need to take the CIL lock in write
    mode to check for potential background push races, which reduces
    lock contention.
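
    For illustration only, a sketch of that sequence tracking - field
    names are assumed and the real locking differs in detail:

    #include <linux/spinlock.h>
    #include <linux/types.h>
    #include <linux/workqueue.h>

    struct cil_push_sketch {
            spinlock_t              xc_lock;
            u64                     xc_current_sequence;
            u64                     xc_push_seq;    /* highest seq queued */
            struct work_struct      xc_push_work;
            struct workqueue_struct *xc_push_wq;
    };

    static void cil_push_background_sketch(struct cil_push_sketch *cil)
    {
            /* unlocked check: a push for this sequence is already queued */
            if (cil->xc_push_seq == cil->xc_current_sequence)
                    return;

            spin_lock(&cil->xc_lock);
            if (cil->xc_push_seq < cil->xc_current_sequence) {
                    cil->xc_push_seq = cil->xc_current_sequence;
                    queue_work(cil->xc_push_wq, &cil->xc_push_work);
            }
            spin_unlock(&cil->xc_lock);
    }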

    To avoid potential issues with "smart" IO schedulers, don't use the
    workqueue for log force triggered flushes. Instead, do them directly
    so that the log IO is issued by the process doing the log force and
    doesn't get stuck behind IO elevator queue idling, which would
    incorrectly delay the log IO issued from the workqueue.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

23 Feb, 2012

4 commits

  • Split the log regrant case out of xfs_log_reserve into a separate function,
    and merge xlog_grant_log_space and xlog_regrant_write_log_space into their
    respective callers. Also replace the XFS_LOG_PERM_RESERV flag, which easily
    got misused before the previous cleanups, with a simple boolean parameter.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Add a new data structure to allow sharing code between the log grant and
    regrant code.

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • The tic->t_wait waitqueues can never have more than a single waiter
    on them, so we can easily replace them with a task_struct pointer
    and wake_up_process.
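
    For illustration only, the resulting pattern with assumed names:

    #include <linux/sched.h>

    struct tic_sketch {
            struct task_struct *t_task;     /* the single sleeping waiter */
    };

    static void tic_wait_sketch(struct tic_sketch *tic)
    {
            tic->t_task = current;
            __set_current_state(TASK_UNINTERRUPTIBLE);
            schedule();                     /* woken by tic_wake_sketch() */
    }

    static void tic_wake_sketch(struct tic_sketch *tic)
    {
            wake_up_process(tic->t_task);   /* at most one waiter exists */
    }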

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     
  • Currently xfs_log_move_tail has a tail_lsn argument that is horribly
    overloaded: it may contain either an actual lsn to assign to the log tail,
    0 as a special case to use the last sync LSN, or 1 to indicate that no tail
    LSN assignment should be performed, and we should opportunistically
    wake up one task waiting for log space even if we did not move the
    LSN.

    Remove the tail LSN assignment from xfs_log_move_tail and make the two
    callers use xlog_assign_tail_lsn instead of the current variant of
    partially using the code in xfs_log_move_tail and partially open-coding
    it. Note that this means we grow an additional lock roundtrip on the
    AIL lock for each bulk update
    or delete, which is still far less than what we had before introducing the
    bulk operations. If this proves to be a problem we can still add a variant
    of xlog_assign_tail_lsn that expects the lock to be held already.

    Also rename the remainder of xfs_log_move_tail to xfs_log_space_wake as
    that name describes its functionality much better.

    Reviewed-by: Mark Tinguely
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

29 Apr, 2011

1 commit

  • Update the extent tree in case we have to reuse a busy extent, so that it
    is always kept up to date. This is done by replacing the busy list searches
    with a new xfs_alloc_busy_reuse helper, which updates the busy extent tree
    in case of a reuse. This allows us to reuse metadata extents
    unconditionally, and thus avoid log forces, especially for allocation btree
    blocks.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

12 Apr, 2011

1 commit

  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: use proper interfaces for on-stack plugging
    xfs: fix xfs_debug warnings
    xfs: fix variable set but not used warnings
    xfs: convert log tail checking to a warning
    xfs: catch bad block numbers freeing extents.
    xfs: push the AIL from memory reclaim and periodic sync
    xfs: clean up code layout in xfs_trans_ail.c
    xfs: convert the xfsaild threads to a workqueue
    xfs: introduce background inode reclaim work
    xfs: convert ENOSPC inode flushing to use new syncd workqueue
    xfs: introduce a xfssyncd workqueue
    xfs: fix extent format buffer allocation size
    xfs: fix unreferenced var error in xfs_buf.c

    Also, applied patch from Tony Luck that fixes ia64:
    xfs_destroy_workqueues() should not be tagged with __exit
    in the branch before merging.

    Linus Torvalds
     

08 Apr, 2011

1 commit

  • On the Power platform, the log tail debug checks fire excessively
    causing the system to panic early in testing. The debug checks are
    known to be racy, though on x86_64 there is no evidence that they
    trigger at all.

    We want to keep the checks active on debug systems to alert us to
    problems with log space accounting, but we need to reduce the impact
    of a racy check on testing on the Power platform.

    As a result, convert the ASSERT conditions to warnings, and
    allow them to fire only once per filesystem mount. This will prevent
    false positives from interfering with testing, whilst still
    providing us with the indication that there may be a problem with
    log space accounting should one occur.
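
    For illustration only, the warn-once pattern this describes - the
    flag name is assumed, and xfs_warn() comes from XFS internal
    headers:

    #define LOG_TAIL_WARNED_SKETCH  (1 << 0)  /* "already warned" bit */

    struct xfs_mount;

    struct log_sketch {
            unsigned int     l_flags;
            struct xfs_mount *l_mp;
    };

    static void log_tail_warn_sketch(struct log_sketch *log)
    {
            if (log->l_flags & LOG_TAIL_WARNED_SKETCH)
                    return;                 /* fire only once per mount */
            log->l_flags |= LOG_TAIL_WARNED_SKETCH;
            xfs_warn(log->l_mp,
                     "log tail check failed - possible space accounting problem");
    }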

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Alex Elder

    Dave Chinner
     

31 Mar, 2011

1 commit


07 Mar, 2011

1 commit

  • Convert the xfs log operations to use the new error logging
    interfaces. This removes the xlog_{warn,panic} wrappers and makes
    almost all errors emit the device they belong to instead of just
    referring to "XFS".

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

21 Dec, 2010

6 commits

  • The only thing that the grant lock remains to protect is the grant head
    manipulations when adding or removing space from the log. These calculations
    are already based on atomic variables, so we can already update them safely
    without locks. However, the grant head manipulations require atomic multi-step
    calculations to be executed, which the algorithms currently don't allow.

    To make these multi-step calculations atomic, convert the algorithms to
    compare-and-exchange loops on the atomic variables. That is, we sample the old
    value, perform the calculation and use atomic64_cmpxchg() to attempt to update
    the head with the new value. If the head has not changed since we sampled it,
    it will succeed and we are done. Otherwise, we rerun the calculation again from
    a new sample of the head.
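
    For illustration only, the shape of that loop - the calculation
    helper is assumed:

    #include <linux/atomic.h>
    #include <linux/types.h>

    s64 grant_head_calc_sketch(s64 head, int bytes);  /* assumed helper */

    static void grant_head_update_sketch(atomic64_t *head, int bytes)
    {
            s64 old, new;

            do {
                    old = atomic64_read(head);
                    new = grant_head_calc_sketch(old, bytes);
            } while (atomic64_cmpxchg(head, old, new) != old);
            /* on interference, resample and redo the calculation */
    }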

    This allows us to remove the grant lock from around all the grant head
    space manipulations, which effectively removes the grant lock from the
    log completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • The log grant ticket wait queues are currently protected by the log
    grant lock. However, the queues are functionally independent from
    each other, and operations on them only require serialisation
    against other queue operations now that all of the other log
    variables they use are atomic values.

    Hence, we can make them independent of the grant lock by introducing
    new locks just to protect the list operations. Because the lists
    are independent, we can use a lock per list and ensure that reserve
    and write head queuing do not contend.

    To ensure forced shutdowns work correctly in conjunction with the
    new fast paths, ensure that we check whether the log has been shut
    down in the grant functions once we hold the relevant spin locks but
    before we go to sleep. This is needed to co-ordinate correctly with
    the wakeups that are issued on the ticket queues so we don't leave
    any processes sleeping on the queues during a shutdown.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • log->l_tail_lsn is currently protected by the log grant lock. The
    lock is only needed for serialising readers against writers, so we
    don't really need the lock if we make the l_tail_lsn variable an
    atomic. Converting the l_tail_lsn variable to an atomic64_t means we
    can start to peel back the grant lock from various operations.

    Also, provide functions to safely crack an atomic LSN variable into
    its component pieces and to recombine the components into an
    atomic variable. Use them where appropriate.

    This also removes the need for explicitly holding a spinlock to read
    the l_tail_lsn on 32 bit platforms.
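
    For illustration only, a sketch of such helpers:

    #include <linux/atomic.h>
    #include <linux/types.h>

    /*
     * An LSN packs a 32 bit cycle and a 32 bit block into 64 bits, so
     * it fits one atomic64_t and 32 bit platforms can read it without
     * a spinlock guarding against torn values.
     */
    static inline void crack_atomic_lsn_sketch(atomic64_t *lsn,
                                               u32 *cycle, u32 *block)
    {
            s64 val = atomic64_read(lsn);

            *cycle = val >> 32;
            *block = (u32)val;
    }

    static inline s64 combine_lsn_sketch(u32 cycle, u32 block)
    {
            return ((s64)cycle << 32) | block;
    }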

    Signed-off-by: Dave Chinner

    Dave Chinner
     
  • The log grant queues are one of the few places left using sv_t
    constructs for waiting. Given we are touching this code, we should
    convert them to plain wait queues. While there, convert all the
    other sv_t users in the log code as well.

    Seeing as this removes the last users of the sv_t type, remove the
    header file defining the wrapper and the fragments that still
    reference it.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • Prepare for switching the grant heads to atomic variables by
    combining the two 32 bit values that make up the grant head into a
    single 64 bit variable. Provide wrapper functions to combine and
    split the grant heads appropriately for calculations and use them as
    necessary.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • The grant write and reserve queues use a roll-your-own doubly linked
    list, so convert them to a standard list_head structure and convert
    all the list traversals to use list_for_each_entry(). We can also
    get rid of the XLOG_TIC_IN_Q flag as we can use the list_empty()
    check to tell if the ticket is in a list or not.
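
    For illustration only, the resulting pattern (names assumed):

    #include <linux/list.h>
    #include <linux/types.h>

    struct qticket_sketch {
            struct list_head t_queue;       /* was roll-your-own prev/next */
    };

    /* replaces the XLOG_TIC_IN_Q flag */
    static bool ticket_queued_sketch(struct qticket_sketch *tic)
    {
            return !list_empty(&tic->t_queue);
    }

    static void ticket_dequeue_sketch(struct qticket_sketch *tic)
    {
            /* list_del_init() keeps list_empty() accurate afterwards */
            list_del_init(&tic->t_queue);
    }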

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

17 Dec, 2010

1 commit


03 Dec, 2010

2 commits

  • Convert the log grant heads to atomic64_t types in preparation for
    converting the accounting algorithms to atomic operations. This patch
    just converts the variables; the algorithmic changes are in a
    separate patch for clarity.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • log->l_last_sync_lsn is updated in only one critical spot - log
    buffer IO completion - and is protected by the grant lock there. This
    requires the grant lock to be taken for every log buffer IO
    completion. Converting the l_last_sync_lsn variable to an atomic64_t
    means that we do not need to take the grant lock in log buffer IO
    completion to update it.

    This also removes the need for explicitly holding a spinlock to read
    the l_last_sync_lsn on 32 bit platforms.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

29 Sep, 2010

1 commit

  • I have been seeing occasional pauses in transaction throughput up to
    30s long under heavy parallel workloads. The only notable thing was
    that the xfsaild was trying to be active during the pauses, but
    making no progress. It was running exactly 20 times a second (on the
    50ms no-progress backoff), and the number of pushbuf events was
    constant across this time as well. IOWs, the xfsaild appeared to be
    stuck on buffers that it could not push out.

    Further investigation indicated that it was trying to push out inode
    buffers that were pinned and/or locked. The xfsbufd was also getting
    woken at the same frequency (by the xfsaild, no doubt) to push out
    delayed write buffers. The xfsbufd was not making any progress
    because all the buffers in the delwri queue were pinned. This scan-
    and-make-no-progress dance went on in the trace for some seconds,
    before the xfssyncd came along and issued a log force, and then
    things started going again.

    However, I noticed something strange about the log force - there
    were way too many IOs issued. 516 log buffers were written, to be
    exact. That added up to 129MB of log IO, which got me very
    interested because it's almost exactly 25% of the size of the log.
    The delayed logging code is supposed to aggregate the minimum of 25%
    of the log or 8MB worth of changes before flushing. That's what
    really puzzled me - why did a log force write 129MB instead of only
    8MB?

    Essentially what has happened is that no CIL pushes had occurred
    since the previous tail push which cleared out 25% of the log space.
    That caused all the new transactions to block because there wasn't
    log space for them, but they kicked the xfsaild to push the tail.
    However, the xfsaild was not making progress because there were
    buffers it could not lock and flush, and the xfsbufd could not flush
    them because they were pinned. As a result, both the xfsaild and the
    xfsbufd could not move the tail of the log forward without the CIL
    first committing.

    The cause of the problem was that the background CIL push, which
    should happen when 8MB of aggregated changes have been committed, is
    being held off by the concurrent transaction commit load. The
    background push does a down_write_trylock() which will fail if there
    is a concurrent transaction commit holding the push lock in read
    mode. With 8 CPUs all doing transactions as fast as they can, there
    were enough concurrent transaction commits to hold off the background
    push until tail-pushing could no longer free log space, and the halt
    would occur.

    It should be noted that there is no reason why it would halt at 25%
    of log space used by a single CIL checkpoint. This bug could
    definitely violate the "no transaction should be larger than half
    the log" requirement and hence result in corruption if the system
    crashed under heavy load. This sort of bug is exactly the reason why
    delayed logging was tagged as experimental....

    The fix is to start blocking background pushes once the threshold
    has been exceeded. Rework the threshold calculations to keep the
    amount of log space a CIL checkpoint can use to below that of the
    AIL push threshold to avoid the problem completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

24 Aug, 2010

1 commit

  • Delayed logging adds some serialisation to the log force process to
    ensure that it does not dereference a bad commit context structure
    when determining if a CIL push is necessary or not. It does this by
    grabbing the CIL context lock exclusively, then dropping it before
    pushing the CIL if necessary. This causes serialisation of all log
    forces and pushes regardless of whether a force is necessary or not.
    As a result fsync heavy workloads (like dbench) can be significantly
    slower with delayed logging than without.

    To avoid this penalty, copy the current sequence from the context to
    the CIL structure when they are swapped. This allows us to do
    unlocked checks on the current sequence without having to worry
    about dereferencing context structures that may have already been
    freed. Hence we can remove the CIL context locking in the forcing
    code and only call into the push code if the current context matches
    the sequence we need to force.

    By passing the sequence into the push code, we can check the
    sequence again once we have the CIL lock held exclusive and abort if
    the sequence has already been pushed. This avoids a lock round-trip
    and unnecessary CIL pushes when we have racing push calls.
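
    For illustration only, a sketch of that re-check - the bookkeeping
    field is assumed:

    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct cil_seq_sketch {
            spinlock_t xc_lock;
            u64        xc_current_sequence; /* copied from ctx at swap */
            u64        xc_pushed_seq;       /* assumed bookkeeping */
    };

    static void cil_push_seq_sketch(struct cil_seq_sketch *cil, u64 seq)
    {
            spin_lock(&cil->xc_lock);
            if (seq <= cil->xc_pushed_seq) {
                    spin_unlock(&cil->xc_lock);
                    return;                 /* raced: already pushed */
            }
            cil->xc_pushed_seq = seq;
            spin_unlock(&cil->xc_lock);
            /* ... push the checkpoint for 'seq' ... */
    }

    static void cil_force_seq_sketch(struct cil_seq_sketch *cil, u64 seq)
    {
            /*
             * Unlocked check is safe: the sequence lives in the CIL
             * itself, not in a context that may already have been freed.
             */
            if (cil->xc_current_sequence == seq)
                    cil_push_seq_sketch(cil, seq);
    }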

    The result is that the regression in dbench performance goes away -
    this change improves dbench performance on a ramdisk from ~2100MB/s
    to ~2500MB/s. This compares favourably to not using delayed logging
    which returns ~2500MB/s for the same workload.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

24 May, 2010

3 commits

  • If we let the CIL grow without bound, it will grow large enough to violate
    recovery constraints (there must be at least one complete transaction in the
    all times) or take forever to write out through the log buffers. Hence we need
    a check during asynchronous transactions as to whether the CIL needs to be
    pushed.

    We track the amount of log space the CIL consumes, so it is relatively simple
    to limit it on a pure size basis. Make the limit the minimum of just under half
    the log size (recovery constraint) or 8MB of log space (which is an awful lot
    of metadata).
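
    For illustration only, the threshold in plain C (the real code keeps
    the limit just under half the log):

    static long cil_space_limit_sketch(long log_size)
    {
            long half = log_size / 2;       /* recovery constraint */
            long cap  = 8 * 1024 * 1024;    /* 8MB of metadata changes */

            return half < cap ? half : cap;
    }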

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • The delayed logging code only changes in-memory structures and as
    such can be enabled and disabled with a mount option. Add the mount
    option and emit a warning that this is an experimental feature that
    should not be used in production yet.

    We also need infrastructure to track committed items that have not
    yet been written to the log. This is what the Committed Item List
    (CIL) is for.

    The log item also needs to be extended to track the current log
    vector, the associated memory buffer and its location in the
    Committed Item List. Extend the log item and log vector structures to enable
    this tracking.

    To maintain the current log format for transactions with delayed
    logging, we need to introduce a checkpoint transaction and a context
    for tracking each checkpoint from initiation to transaction
    completion. This includes adding a log ticket for tracking the log
    space required/used by the context checkpoint.

    To track all the changes we need an io vector array per log item,
    rather than a single array for the entire transaction. Using the new
    log vector structure for this requires two passes - the first to
    allocate the log vector structures and chain them together, and the
    second to fill them out. This log vector chain can then be passed
    to the CIL for formatting, pinning and insertion into the CIL.

    Formatting of the log vector chain is relatively simple - it's just
    a loop over the iovecs on each log vector, but it is made slightly
    more complex because we re-write the iovec after the copy to point
    back at the memory buffer we just copied into.
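
    For illustration only, the shape of that loop (names assumed):

    #include <linux/string.h>

    struct iovec_sketch {
            void *i_addr;
            int   i_len;
    };

    static char *format_vecs_sketch(struct iovec_sketch *vecs, int nvecs,
                                    char *buf)
    {
            int i;

            for (i = 0; i < nvecs; i++) {
                    memcpy(buf, vecs[i].i_addr, vecs[i].i_len);
                    vecs[i].i_addr = buf;   /* re-point at the copy */
                    buf += vecs[i].i_len;
            }
            return buf;                     /* next free byte */
    }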

    This code also needs to pin log items. If the log item is not
    already tracked in this checkpoint context, then it needs to be
    pinned. Otherwise it is already pinned and we don't need to pin it
    again.

    The only other complexity is calculating the amount of new log space
    the formatting has consumed. This needs to be accounted to the
    transaction in progress, and the accounting is made more complex
    because we also need to steal space from it for log metadata in the
    checkpoint transaction. Calculate all this at insert time and update
    all the tickets, counters, etc correctly.

    Once we've formatted all the log items in the transaction, attach
    the busy extents to the checkpoint context so the busy extents live
    until checkpoint completion and can be processed at that point in
    time. Transactions can then be freed.

    Now we need to issue checkpoints - we are tracking the amount of log space
    used by the items in the CIL, so we can trigger background checkpoints when the
    space usage gets to a certain threshold. Otherwise, checkpoints need to be
    triggered when a log synchronisation point is reached - a log force event.

    Because the log write code already handles chained log vectors, writing the
    transaction is trivial, too. Construct a transaction header, add it
    to the head of the chain and write it into the log, then issue a
    commit record write. Then we can release the checkpoint log ticket
    and attach the context to the log buffer so it can be called during
    IO completion to complete the checkpoint.

    We also need to allow for synchronising multiple in-flight
    checkpoints. This is needed for two things - the first is to ensure
    that checkpoint commit records appear in the log in the correct
    sequence order (so they are replayed in the correct order). The
    second is so that xfs_log_force_lsn() operates correctly and only
    flushes and/or waits for the specific sequence it was provided with.

    To do this we need a wait variable and a list tracking the
    checkpoint commits in progress. We can walk this list and wait for
    the checkpoints to change state or complete easily, and this provides
    the necessary synchronisation for correct operation in both cases.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • The ticket ID is needed to uniquely identify transactions when doing busy
    extent matching. Delayed logging changes the lifecycle of busy extents with
    respect to the transaction structure lifecycle. Hence we can no longer use
    the transaction structure as a means of determining the owner of the busy
    extent as it may be freed and reused while the busy extent is still active.

    This commit provides the infrastructure to access the xlog_tid_t held in the
    ticket from a transaction handle. This avoids the need for callers to peek
    into the transaction and log structures to find this out.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     

19 May, 2010

3 commits

  • There remains only one user of the l_sectbb_mask field in the log
    structure. Just kill it off and compute the mask where needed from
    the power-of-2 sector size.

    (The only update from the last post is to accommodate the changes in the
    previous patch in the series.)
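
    For illustration only, the mask computed on the fly:

    /*
     * The sector size in basic blocks is a power of two, so the
     * round-down mask is just (size - 1); no cached l_sectbb_mask
     * field is needed.
     */
    static inline unsigned int sector_offset_sketch(unsigned int bbno,
                                                    unsigned int sectBBsize)
    {
            return bbno & (sectBBsize - 1);
    }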

    Signed-off-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Alex Elder
     
  • Change struct log so it keeps track of the size (in basic blocks) of
    a log sector in l_sectBBsize rather than the log-base-2 of that
    value (previously, l_sectbb_log). The name was chosen for
    consistency with the other fields in the structure that represent
    a number of basic blocks.

    (Updated so that a variable used in computing and verifying a log's
    sector size is named "log2_size". Also added the "BB" to the
    structure field name, based on feedback from Eric Sandeen. Also
    dropped some superfluous parentheses.)

    Signed-off-by: Alex Elder
    Reviewed-by: Eric Sandeen

    Alex Elder
     
  • Replace the awkward xlog_write_adv_cnt with an inline helper that makes
    it more obvious that it is modifying its parameters, and replace the use
    of an integer type for "ptr" with a real void pointer. Also move
    xlog_write_adv_cnt to xfs_log_priv.h as it will be used outside of
    xfs_log.c in the delayed logging series.
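
    For illustration only, roughly the shape of such a helper:

    #include <linux/types.h>

    /*
     * Advancing all three values in one place makes the in-place
     * modification of the parameters explicit at the call site.
     */
    static inline void write_adv_cnt_sketch(void **ptr, int *len, int *off,
                                            size_t nbytes)
    {
            *ptr += nbytes;         /* void arithmetic: a GCC extension */
            *len -= nbytes;
            *off += nbytes;
    }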

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

16 Jan, 2010

1 commit


15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. All of this is described in more
    detail in Documentation/trace/events.txt. To read the events, do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads hit various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not yet in mainline; I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

01 Sep, 2009

1 commit


16 Mar, 2009

1 commit

  • Most callers of xlog_bread need to call xlog_align to get the actual offset.
    Consolidate that call into the main xlog_bread and provide a _xlog_bread
    for the few callers that don't want the actual offset.
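
    For illustration only, the consolidated shape (names assumed):

    struct log_sk;                  /* opaque for the sketch */

    int  log_bread_noalign_sketch(struct log_sk *log, int blkno,
                                  int nbblks, char *bp);
    char *log_align_sketch(struct log_sk *log, int blkno, char *bp);

    /*
     * Common case: read and hand back the aligned data pointer, so
     * callers no longer pair every read with an align call.
     */
    static int log_bread_sketch(struct log_sk *log, int blkno, int nbblks,
                                char *bp, char **offset)
    {
            int error = log_bread_noalign_sketch(log, blkno, nbblks, bp);

            if (!error)
                    *offset = log_align_sketch(log, blkno, bp);
            return error;
    }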

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig