16 Nov, 2012

6 commits

  • To separate the verifiers from iodone functions and associate read
    and write verifiers at the same time, introduce a buffer verifier
    operations structure to the xfs_buf.

    This avoids the need for assigning the write verifier, clearing the
    iodone function and re-running ioend processing in the read
    verifier, and gets rid of the nasty "b_pre_io" name for the write
    verifier function pointer. If we ever need to, it will also be
    easier to add further content specific callbacks to a buffer with an
    ops structure in place.

    We also avoid needing to export verifier functions; instead we
    can simply export the ops structures for those that are needed
    outside the file they are defined in.

    This patch also fixes a directory block readahead verifier issue
    it exposed.

    This patch also adds ops callbacks to the inode/alloc btree blocks
    initialised by growfs. These will need more work before they will
    work with CRCs.

    Signed-off-by: Dave Chinner
    Reviewed-by: Phil White
    Signed-off-by: Ben Myers

    Dave Chinner
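
The ops-structure idea described in this commit can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: struct buf, struct buf_ops, DEMO_MAGIC, demo_verify, buf_read_done and buf_write_prepare are all hypothetical names invented for the sketch. It shows only that one table attaches the read and write verifiers together, and that exporting the table is enough to keep the verifier function itself private.

```c
#include <assert.h>
#include <stddef.h>

struct buf;

/* One table carries both verifiers, so they are always attached together. */
struct buf_ops {
    void (*verify_read)(struct buf *bp);
    void (*verify_write)(struct buf *bp);
};

/* Hypothetical stand-in for a metadata buffer. */
struct buf {
    unsigned int magic;          /* first word of the block contents */
    int error;                   /* 0, or nonzero once verification failed */
    const struct buf_ops *ops;   /* NULL means no verifier is attached */
};

#define DEMO_MAGIC 0x58444d4fu   /* made-up magic number for this sketch */

/* The verifier stays static; only the ops table below is exported. */
static void demo_verify(struct buf *bp)
{
    if (bp->magic != DEMO_MAGIC)
        bp->error = -1;
}

const struct buf_ops demo_buf_ops = {
    .verify_read = demo_verify,
    .verify_write = demo_verify,
};

/* Run after a read completes: call the read verifier if one is attached. */
int buf_read_done(struct buf *bp)
{
    assert(bp != NULL);
    if (bp->ops && bp->ops->verify_read)
        bp->ops->verify_read(bp);
    return bp->error;
}

/* Run before a write is issued: call the write verifier if one is attached. */
int buf_write_prepare(struct buf *bp)
{
    assert(bp != NULL);
    if (bp->ops && bp->ops->verify_write)
        bp->ops->verify_write(bp);
    return bp->error;
}
```

A caller would set bp->ops once when the buffer is obtained; further content-specific callbacks could later be added as extra members of the table without touching the callers.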
     
  • Metadata buffers that are read from disk have write verifiers
    already attached to them, but newly allocated buffers do not. Add
    appropriate write verifiers to all new metadata buffers.

    Signed-off-by: Dave Chinner
    Reviewed-by: Ben Myers
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • These verifiers are essentially the same code as the read verifiers,
    but do not require ioend processing. Hence factor the read verifier
    functions and add a new write verifier wrapper that is used as the
    callback.

    This is done as one large patch for all verifiers rather than one
    patch per verifier as the change is largely mechanical. This
    includes hooking up the write verifier via the read verifier
    function.

    Hooking up the write verifier for buffers obtained via
    xfs_trans_get_buf() will be done in a separate patch as that touches
    code in many different places rather than just the verifier
    functions.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
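
The factoring described in this commit might look roughly like the following userspace sketch. All names here (struct blk, BLK_MAGIC, blk_verify, blk_ioend, blk_read_verify, blk_write_verify) are hypothetical and chosen only for illustration. The point is that the content check is shared, while only the read-side wrapper goes on to completion processing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block with just enough state to observe the behaviour. */
struct blk {
    unsigned int magic;
    int error;
    int ioend_runs;   /* counts how often completion processing ran */
};

#define BLK_MAGIC 0xabcd1234u   /* made-up magic number for this sketch */

/* Shared check: pure validation, no side effects. */
static bool blk_verify(const struct blk *b)
{
    return b->magic == BLK_MAGIC;
}

/* Stand-in for I/O completion (ioend) processing. */
static void blk_ioend(struct blk *b)
{
    b->ioend_runs++;
}

/* Read side: validate, then continue with completion processing. */
void blk_read_verify(struct blk *b)
{
    assert(b != NULL);
    if (!blk_verify(b))
        b->error = -1;
    blk_ioend(b);
}

/* Write side: validate only; no completion processing is needed. */
void blk_write_verify(struct blk *b)
{
    assert(b != NULL);
    if (!blk_verify(b))
        b->error = -1;
}
```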
     
  • Signed-off-by: Dave Chinner
    Reviewed-by: Phil White
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • Some reads are not converted yet because it isn't obvious ahead of
    time what the format of the block is going to be. Need to determine
    how to tell if the first block in the tree is a node or leaf format
    block. That will be done in later patches.

    Signed-off-by: Dave Chinner
    Reviewed-by: Phil White
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Phil White
    Signed-off-by: Ben Myers

    Dave Chinner
     

14 Nov, 2012

2 commits

  • Added when debugging recent attribute tree problems to more finely
    trace code execution through the maze of twisty passages that makes
    up the attr code.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • In certain circumstances, a double split of an attribute tree is
    needed to insert or replace an attribute. In rare situations, this
    can go wrong, leaving the attribute tree corrupted. In this case,
    the attr being replaced is the last attr in a leaf node, and the
    replacement is larger so doesn't fit in the same leaf node.

    Take as the initial condition a node format attribute btree with
    two leaves at index 1 and 2. Call them L1 and L2. The leaf L1 is
    completely full; there is not a single byte of free space in it.
    L2 is mostly empty. The attribute being replaced - call it X - is
    the last attribute in L1.

    The way an attribute replace is executed is that the replacement
    attribute - call it Y - is first inserted into the tree, but has an
    INCOMPLETE flag set on it so that list traversals ignore it. Once
    this transaction is committed, a second transaction is run to
    atomically mark Y as COMPLETE and X as INCOMPLETE, so that a
    traversal will now find Y and skip X. Once that transaction is
    committed, attribute X is then removed.

    So, the initial condition is:

    [Diagram: L1 (fwd: 2, bwd: 0) ----> L2 (fwd: 0); later states show a
    three-leaf chain fwd: 3 ----> fwd: 2 ----> fwd: 0.]

    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
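
The three-step replace protocol described in this commit message (insert Y as INCOMPLETE, flip the flags on X and Y in one step, then remove X) can be sketched with a flat table in place of the attribute btree. This is a toy model under stated assumptions: struct attr_table and the attr_* functions are hypothetical, there are no transactions or leaf splits here, and the single function attr_flip merely stands in for the atomic second transaction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

struct entry {
    const char *name;
    const char *value;
    int incomplete;   /* nonzero: lookups must skip this entry */
    int in_use;
};

/* Flat stand-in for the attribute tree. */
struct attr_table {
    struct entry e[MAX_ENTRIES];
};

/* Lookups ignore entries that are still marked INCOMPLETE. */
const char *attr_lookup(const struct attr_table *t, const char *name)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        const struct entry *en = &t->e[i];
        if (en->in_use && !en->incomplete && strcmp(en->name, name) == 0)
            return en->value;
    }
    return NULL;
}

/* Step 1: insert the replacement, hidden behind the INCOMPLETE flag.
 * Returns the slot used, or -1 if the table is full. */
int attr_insert_incomplete(struct attr_table *t, const char *name,
                           const char *value)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (!t->e[i].in_use) {
            t->e[i] = (struct entry){ name, value, 1, 1 };
            return i;
        }
    }
    return -1;
}

/* Step 2: expose the new entry and hide the old one in a single step. */
void attr_flip(struct attr_table *t, int old_slot, int new_slot)
{
    t->e[new_slot].incomplete = 0;
    t->e[old_slot].incomplete = 1;
}

/* Step 3: drop the old, now hidden, entry. */
void attr_remove(struct attr_table *t, int slot)
{
    t->e[slot].in_use = 0;
}
```

At every point between the steps a lookup sees exactly one value for the name, which is the property the INCOMPLETE flag is there to provide.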
     

02 Jul, 2012

1 commit

  • The struct xfs_dabuf now only tracks a single xfs_buf and all the
    information it holds can be gained directly from the xfs_buf. Hence
    we can remove the struct dabuf and pass the xfs_buf around
    everywhere.

    Kill the struct dabuf and the associated infrastructure.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Dave Chinner
     

15 May, 2012

2 commits

  • Untangle the header file includes a bit by moving the definition of
    xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
    xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
    xfs_ag.h.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • Buffers are always returned locked from the lookup routines. Hence
    we don't need to tell the lookup routines to return locked buffers,
    or to try and lock them. Remove XBF_LOCK from all the callers and
    from internal buffer cache usage.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

28 Mar, 2012

1 commit


18 Jan, 2012

1 commit

  • We spent a lot of effort to maintain this field, but it always equals the
    fork size divided by the constant size of an extent. The prime use of it is
    to assert that the two stay in sync. Just divide the fork size by the extent
    size in the few places that we actually use it and remove the overhead
    of maintaining it. Also introduce a few helpers to consolidate the places
    where we actually care about the value.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
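
A helper of the kind this commit describes could be as small as the following sketch. The types struct ext_rec and struct fork and the function fork_extent_count are hypothetical stand-ins, and the 16-byte record layout is an assumption made for the example. The count is derived on demand from the fork size rather than stored and kept in sync.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size extent record (two 64-bit words). */
struct ext_rec {
    uint64_t l0;
    uint64_t l1;
};

/* Hypothetical fork: only the size in bytes is stored. */
struct fork {
    size_t bytes;
};

/* Derive the extent count instead of maintaining a separate field. */
static inline size_t fork_extent_count(const struct fork *f)
{
    /* The fork size must be a whole number of records. */
    assert(f->bytes % sizeof(struct ext_rec) == 0);
    return f->bytes / sizeof(struct ext_rec);
}
```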
     

30 Nov, 2011

1 commit

  • With Dmitry's fsstress updates I've seen very reproducible crashes in
    xfs_attr_shortform_remove because xfs_attr_shortform_bytesfit claims that
    the attributes would not fit inline into the inode after removing an
    attribute. It turns out that we were operating on an inode with lots
    of delalloc extents, and thus an if_bytes value for the data fork that
    is larger than the biggest possible on-disk storage for it, which utterly
    confuses the code near the end of xfs_attr_shortform_bytesfit.

    Fix this by always allowing the current attribute fork, like we already
    do for the attr1 format, given that delalloc conversion will take care
    of moving either the data or attribute area out of line if it doesn't
    fit at that point - or making the point moot by merging extents at this
    point.

    Also document the function better, and clean up some loose bits.

    Reviewed-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

12 Oct, 2011

2 commits

  • xfs_bmapi() currently handles both extent map reading and
    allocation. As a result, the code is littered with "if (wr)"
    branches to conditionally do allocation operations if required.
    This makes the code much harder to follow and causes significant
    indent issues with the code.

    Given that read mapping is much simpler than allocation, we can
    split out read mapping from xfs_bmapi() and reuse the logic that
    we have already factored out to do all the hard work of handling the
    extent map manipulations. This results in a much simpler function for
    the common extent read operations, and will allow the allocation
    code to be simplified in another commit.

    Once xfs_bmapi_read() is implemented, convert all the callers of
    xfs_bmapi() that are only reading extents to use the new function.

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • Check the return value of xfs_trans_get_buf() and fail
    appropriately.

    Signed-off-by: Chandra Seetharaman
    Signed-off-by: Alex Elder

    Chandra Seetharaman
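
The pattern this commit applies, checking a buffer getter's return value before use, can be sketched as follows. demo_get_buf and demo_init_block are hypothetical userspace stand-ins, not the XFS functions, and returning ENOMEM on failure is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct dbuf {
    char data[64];
};

/* Hypothetical getter that may return NULL. The 'fail' argument exists
 * only so the sketch can exercise the failure path. */
struct dbuf *demo_get_buf(int fail)
{
    return fail ? NULL : calloc(1, sizeof(struct dbuf));
}

/* Initialise a block, failing cleanly if no buffer could be obtained. */
int demo_init_block(int fail)
{
    struct dbuf *bp = demo_get_buf(fail);

    if (!bp)
        return ENOMEM;   /* propagate the failure instead of dereferencing NULL */

    bp->data[0] = 1;
    free(bp);
    return 0;
}
```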
     

08 Jul, 2011

1 commit


23 Dec, 2010

1 commit

  • When listing attributes, we are doing memory allocations under the
    inode ilock using only KM_SLEEP. This allows memory allocation to
    recurse back into the filesystem and do writeback, which may take the
    ilock we already hold on the current inode. This will deadlock.
    Hence use KM_NOFS for such allocations outside of transaction
    context to ensure that reclaim recursion does not occur.

    Reported-by: Nick Piggin
    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

27 Jul, 2010

3 commits

  • This code was introduced four years ago in commit
    3e57ecf640428c01ba1ed8c8fc538447ada1715b without any review and has
    been unused since. Remove it, just like the rest of the code introduced
    in that commit, to reduce the stack usage and complexity in this central
    piece of code.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Dmapi support was never merged upstream, but we still have a lot of hooks
    bloating XFS for it, all over the fast paths of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we ever get HSM
    support in mainline, at least the namespace events can be done much more
    sanely in the VFS instead of the individual filesystem, so it's not like
    this is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

22 Jan, 2010

1 commit

  • Currently we define aliases for the buffer flags in various
    namespaces, which only adds confusion. Remove all but the XBF_
    flags to clean this up a bit.

    Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
    uses, but I'll clean that up later.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

20 Jan, 2010

1 commit

  • To be consistent with the directory code, the attr code should use
    unsigned names. Convert the names from the vfs at the highest level
    to unsigned, and ensure they are consistently used as unsigned down
    to disk.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. In Documentation/trace/events.txt
    all this is described in more detail. To read the events do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads hit various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not in mainline yet; I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

12 Dec, 2009

1 commit

  • Remove our own STATIC_INLINE macro. For small functions inside
    implementation files just use STATIC and let gcc inline it, and for
    those in headers do the normal static inline - they are all small
    enough to be inlined for debug builds, too.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

30 Mar, 2009

1 commit

  • With the upcoming v3 inodes the default attroffset needs to be calculated
    for each specific inode, so we can't cache it in the superblock anymore.

    Also replace the assert for wrong inode sizes with a proper error check
    also included in non-debug builds. Note that the ENOSYS return for
    that might seem odd, but that error is returned by xfs_mount_validate_sb
    for all theoretically valid but not supported filesystem geometries.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Josef 'Jeff' Sipek

    Christoph Hellwig
     

04 Feb, 2009

1 commit


09 Jan, 2009

1 commit


13 Aug, 2008

1 commit

  • Move it from the attr code to the transaction code and make
    the attr code call the new function.

    rolltrans is really useful whenever we want to use a rolling
    transaction and should be generic; it isn't dependent on any part
    of the attr code anyway.

    We use this excuse to change all the:

    if ((error = xfs_attr_rolltrans()))

    calls into:

    error = xfs_trans_roll();
    if (error)

    SGI-PV: 981498
    SGI-Modid: xfs-linux-melb:xfs-kern:31729a

    Signed-off-by: Niv Sardi
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    Niv Sardi
     

28 Jul, 2008

3 commits


18 Apr, 2008

1 commit

  • In the case where we mount a filesystem which was previously using the
    attr2 format as attr1, returning the default mp->m_attroffset instead of
    the per-inode di_forkoff for inline attribute fit calculations may result
    in corruption if, for example, the data fork is already taking more space
    than the default fork offset and we try to add an extended attribute. Fix
    tested by xfstests/186.

    SGI-PV: 979606
    SGI-Modid: xfs-linux-melb:xfs-kern:30861a

    Signed-off-by: Eric Sandeen
    Signed-off-by: Tim Shimmin
    Signed-off-by: Lachlan McIlroy

    Eric Sandeen
     

10 Apr, 2008

1 commit


14 Feb, 2008

1 commit


07 Feb, 2008

1 commit

  • Un-obfuscate XFS_SB_LOCK, remove XFS_SB_LOCK->mutex_lock->spin_lock
    macros, call spin_lock directly, remove extraneous cookie holdover from
    old xfs code, and change lock type to spinlock_t.

    SGI-PV: 970382
    SGI-Modid: xfs-linux-melb:xfs-kern:29746a

    Signed-off-by: Eric Sandeen
    Signed-off-by: Donald Douwsma
    Signed-off-by: Tim Shimmin

    Eric Sandeen
     

08 May, 2007

1 commit


10 Feb, 2007

3 commits

  • SGI-PV: 960791
    SGI-Modid: xfs-linux-melb:xfs-kern:28021a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Barry Naujok
    Signed-off-by: Tim Shimmin

    Lachlan McIlroy
     
  • SGI-PV: 958747
    SGI-Modid: xfs-linux-melb:xfs-kern:27792a

    Signed-off-by: Barry Naujok
    Signed-off-by: Russell Cattelan
    Signed-off-by: Tim Shimmin

    Barry Naujok
     
  • gcc-4.1 and more recent aggressively inline static functions which
    increases XFS stack usage by ~15% in critical paths. Prevent this from
    occurring by adding noinline to the STATIC definition.

    Also uninline some functions that are too large to be inlined and were
    causing problems with CONFIG_FORCED_INLINING=y.

    Finally, clean up all the different users of inline, __inline and
    __inline__ and put them under one STATIC_INLINE macro. For debug kernels
    the STATIC_INLINE macro uninlines those functions.

    SGI-PV: 957159
    SGI-Modid: xfs-linux-melb:xfs-kern:27585a

    Signed-off-by: David Chinner
    Signed-off-by: David Chatterton
    Signed-off-by: Tim Shimmin

    David Chinner
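
The macro arrangement this commit describes might be sketched as below. These are illustrative definitions only, not the ones from the XFS headers. They assume a GCC-compatible compiler for the noinline attribute and use a DEBUG macro as the debug-build switch. add_one, twice and demo_sum are hypothetical functions that exist just to exercise the macros.

```c
#include <assert.h>

/* Sketch: file-local functions are never inlined, which keeps the
 * compiler from merging their stack frames into the caller's. */
#define STATIC static __attribute__((noinline))

/* Sketch: one spelling for all inline helpers; debug builds uninline them. */
#ifdef DEBUG
#define STATIC_INLINE static __attribute__((noinline))
#else
#define STATIC_INLINE static inline
#endif

STATIC int add_one(int a)
{
    return a + 1;
}

STATIC_INLINE int twice(int x)
{
    return 2 * x;
}

int demo_sum(int a)
{
    return add_one(twice(a));
}
```

Behaviour is identical with or without DEBUG defined; only the inlining decision, and with it the stack layout, changes.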