24 Aug, 2012

1 commit


30 Jul, 2012

5 commits


23 Jul, 2012

1 commit


15 May, 2012

2 commits

  • With the removal of xfs_rw.h and other changes over time, xfs_bit.h
    is being included in many files that don't actually need it. Clean
    up the includes as necessary.

    Also move the only-used-once xfs_ialloc_find_free() static inline
    function out of a header file that is widely included to reduce
    the number of needless dependencies on xfs_bit.h.

    Signed-off-by: Dave Chinner
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     
  • Buffers are always returned locked from the lookup routines. Hence
    we don't need to tell the lookup routines to return locked buffers,
    or to try and lock them. Remove XBF_LOCK from all the callers and
    from internal buffer cache usage.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Dave Chinner
     

04 Jan, 2012

1 commit


12 Oct, 2011

2 commits


26 Jul, 2011

1 commit


08 Jul, 2011

1 commit


07 Mar, 2011

2 commits


19 Oct, 2010

1 commit


24 Aug, 2010

1 commit

  • Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode
    numbers during lookup") changes the inode lookup code to do btree lookups for
    untrusted inode numbers. This change made an invalid assumption about the
    alignment of inodes and hence incorrectly calculated the first inode in the
    cluster. As a result, some inode numbers were being incorrectly considered
    invalid when they were actually valid.

    The issue was not picked up by the xfstests suite because it always runs fsr
    and dump (the two utilities that utilise the bulkstat interface) on cache hot
    inodes and hence the lookup code in the cold cache path was not sufficiently
    exercised to uncover this intermittent problem.

    Fix the issue by relaxing the btree lookup criteria and then checking if the
    record returned contains the inode number we are looking up. If we get an
    incorrect record, then the inode number is invalid.

    Cc:
    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
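
    The relaxed lookup described in this commit can be sketched in user space
    as follows. This is an illustration only, not the kernel code: the record
    type, the function names and the sorted array standing in for the inode
    btree are all hypothetical, and the 64-inode chunk size is an assumption
    of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inodes covered by one record; an assumption of this sketch. */
#define INODES_PER_CHUNK 64u

/* Hypothetical stand-in for an inode btree record. */
struct inobt_rec {
    uint32_t startino; /* first inode number covered by the record */
};

/*
 * Relaxed lookup: return the last record whose startino is <= ino,
 * or NULL if there is none. recs must be sorted by startino.
 */
static const struct inobt_rec *
lookup_le(const struct inobt_rec *recs, size_t n, uint32_t ino)
{
    const struct inobt_rec *found = NULL;
    size_t lo = 0, hi = n;

    assert(recs != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (recs[mid].startino <= ino) {
            found = &recs[mid];
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return found;
}

/*
 * The relaxed lookup can land on a neighbouring record, so the caller
 * checks that the record really contains the inode number.
 */
static bool
ino_is_valid(const struct inobt_rec *recs, size_t n, uint32_t ino)
{
    const struct inobt_rec *rec = lookup_le(recs, n, ino);

    if (rec == NULL)
        return false;
    return ino - rec->startino < INODES_PER_CHUNK;
}
```

    The point of the sketch is that no alignment is assumed when locating the
    record: whichever record the lookup returns is accepted only if the inode
    number falls inside it.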
     

27 Jul, 2010

2 commits

  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Dmapi support was never merged upstream, but we still have a lot of hooks
    bloating XFS for it, all over the fast paths of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we ever get HSM
    support in mainline, at least the namespace events can be done much more sanely
    in the VFS instead of the individual filesystem, so it's not like this
    is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

24 Jun, 2010

3 commits

  • The block number comes from bulkstat based inode lookups to shortcut
    the mapping calculations. We are not able to trust anything from
    bulkstat, so drop the block number as well so that the correct
    lookups and mappings are always done.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • Inode numbers may come from somewhere external to the filesystem
    (e.g. file handles, bulkstat information) and so are inherently
    untrusted. Rename the flag we use for these lookups to make it
    obvious we are doing a lookup of an untrusted inode number and need
    to verify it completely before trying to read it from disk.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • When we decode a handle or do a bulkstat lookup, we are using an
    inode number we cannot trust to be valid. If we are deleting inode
    chunks from disk (default noikeep mode), then we cannot trust the on
    disk inode buffer for any given inode number to correctly reflect
    whether the inode has been unlinked as neither the di_mode nor the
    generation number may have been updated on disk.

    This is due to the fact that when we delete an inode chunk, we do
    not write the clusters back to disk when they are removed - instead
    we mark them stale to avoid them being written back potentially over
    the top of something that has been subsequently allocated at that
    location. The result is that we can have locations on disk that look
    like they contain valid inodes but in reality do not. Hence we
    cannot simply convert the inode number to a block number and read
    the location from disk to determine if the inode is valid or not.

    As a result, an XFS_IGET_BULKSTAT lookup needs to actually look the
    inode up in the inode allocation btree to determine if the inode
    number is valid or not.

    It should be noted that even on ikeep filesystems, there is the
    possibility that blocks on disk may look like valid inode clusters,
    e.g. if there are filesystem images hosted on the filesystem. Hence
    even for ikeep filesystems we really need to validate that the inode
    number is valid before issuing the inode buffer read.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

22 Jan, 2010

1 commit

  • Currently we define aliases for the buffer flags in various
    namespaces, which only adds confusion. Remove all but the XBF_
    flags to clean this up a bit.

    Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
    uses, but I'll clean that up later.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

16 Jan, 2010

2 commits

  • The use of an array for the per-ag structures requires reallocation
    of the array when growing the filesystem. This requires locking
    access to the array to avoid use after free situations, and the
    locking is difficult to get right. To avoid needing to reallocate an
    array, change the per-ag structures to an allocated object per ag
    and index them using a tree structure.

    The AGs are always densely indexed (hence the use of an array), but
    the number supported is 2^32 and lookups tend to be random and hence
    indexing needs to scale. A simple choice is a radix tree - it works
    well with this sort of index. This change also removes another
    large contiguous allocation from the mount/growfs path in XFS.

    The growing process now needs to change to only initialise the new
    AGs required for the extra space, and as such only needs to
    exclusively lock the tree for inserts. The rest of the code only
    needs to lock the tree while doing lookups, and hence this will
    remove all the deadlocks that currently occur on the m_perag_lock as
    it is now an innermost lock. The lock is also changed to a spinlock
    from a read/write lock as the hold time is now extremely short.

    To complete the picture, the per-ag structures will need to be
    reference counted to ensure that we don't free/modify them while
    they are still in use. This will be done in a subsequent patch.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • Convert the remaining direct lookups of the per ag structures to use
    get/put accesses. Ensure that the loops across AGs and prior users
    of the interface balance gets and puts correctly.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
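
    A minimal user-space sketch of balanced get/put accesses in a loop across
    AGs. All names here are hypothetical, the reference count is a plain
    integer with no locking, and a pointer table stands in for the tree index
    described in the previous commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-AG object; one allocation per AG. */
struct perag {
    unsigned agno;      /* AG number */
    unsigned freecount; /* free inodes in this AG */
    int ref;            /* references currently held */
};

/* Take a reference on the per-AG object, or return NULL if absent. */
static struct perag *
perag_get(struct perag *table[], size_t nags, size_t agno)
{
    if (agno >= nags || table[agno] == NULL)
        return NULL;
    table[agno]->ref++;
    return table[agno];
}

/* Drop a reference taken by perag_get(). */
static void
perag_put(struct perag *pag)
{
    assert(pag->ref > 0);
    pag->ref--;
}

/*
 * Walk the AGs looking for one with at least `want` free inodes.
 * Every get is paired with a put, including on the early return.
 */
static int
find_ag_with_free(struct perag *table[], size_t nags, unsigned want)
{
    for (size_t agno = 0; agno < nags; agno++) {
        struct perag *pag = perag_get(table, nags, agno);
        int hit;

        if (pag == NULL)
            continue;
        hit = pag->freecount >= want;
        perag_put(pag); /* drop before any early return */
        if (hit)
            return (int)agno;
    }
    return -1;
}
```

    After any call to find_ag_with_free(), every ref field should be back at
    its starting value; a non-zero count would indicate an unbalanced get.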
     

12 Dec, 2009

1 commit

  • Remove our own STATIC_INLINE macro. For small functions inside
    implementation files just use STATIC and let gcc inline it, and for
    those in headers do the normal static inline - they are all small
    enough to be inlined for debug builds, too.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

30 Oct, 2009

1 commit

  • Commit bd169565993b39b9b4b102cdac8b13e0a259ce2f seems
    to have a slight regression where this code path:

        if (!--searchdistance) {
                /*
                 * Not in range - save last search
                 * location and allocate a new inode
                 */
                ...
                goto newino;
        }

    doesn't free the temporary cursor (tcur) that got dup'd in
    this function.

    This leaks an item in the xfs_btree_cur zone, and it's caught
    on module unload:

    ===========================================================
    BUG xfs_btree_cur: Objects remaining on kmem_cache_close()
    -----------------------------------------------------------

    It seems like maybe a single free at the end of the function might
    be cleaner, but for now put a del_cursor right in this code block
    similar to the handling in the rest of the function.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Christoph Hellwig

    Eric Sandeen
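
    The shape of the leak and of the fix can be sketched in user space. The
    cursor type, the helpers and the live_cursors counter below are
    hypothetical stand-ins; only searchdistance, tcur and the newino label
    are taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cursor; the real one walks a btree. */
struct cursor {
    int pos;
};

/* Number of cursors allocated and not yet freed. */
static int live_cursors;

static struct cursor *
cursor_dup(const struct cursor *cur)
{
    struct cursor *tcur = malloc(sizeof(*tcur));

    if (tcur != NULL) {
        *tcur = *cur;
        live_cursors++;
    }
    return tcur;
}

static void
cursor_del(struct cursor *cur)
{
    free(cur);
    live_cursors--;
}

/*
 * Step a duplicated cursor towards target, giving up once the search
 * distance is used up. Returns 1 if target was reached, 0 if the
 * search gave up, -1 on allocation failure.
 */
static int
search(const struct cursor *cur, int searchdistance, int target)
{
    struct cursor *tcur = cursor_dup(cur);

    assert(searchdistance > 0);
    if (tcur == NULL)
        return -1;
    while (tcur->pos != target) {
        if (!--searchdistance) {
            /* Without this call the early exit leaks tcur. */
            cursor_del(tcur);
            goto newino;
        }
        tcur->pos++;
    }
    cursor_del(tcur);
    return 1;
newino:
    return 0;
}
```

    Checking that live_cursors is zero after both exit paths plays the role
    that the slab check on module unload plays in the report above.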
     

02 Sep, 2009

8 commits

  • xfs_inobt_lookup is also used in xfs_itable.c, remove the STATIC modifier
    from its declaration to fix non-debug builds.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Felix Blyakher
    Signed-off-by: Felix Blyakher

    Christoph Hellwig
     
  • Don't search too far - abort if it is outside a certain radius and simply do
    a linear search for the first free inode. In AGs with a million inodes this
    can speed up allocation by 3-4x.

    [hch: ported to the new xfs_ialloc.c world order]

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Dave Chinner
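
    A rough sketch of a radius-limited search with a linear fallback. This is
    not the kernel implementation: the array of per-chunk free counts, the
    function name and the choice to try the left neighbour before the right
    one are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Look for a chunk with free inodes near `start`, examining at most
 * `radius` chunks in each direction. If nothing is found in range,
 * fall back to a linear scan for the first chunk with free inodes.
 * Returns the chunk index, or -1 if no chunk has free inodes.
 */
static long
find_free_chunk(const unsigned *freecount, size_t n, size_t start,
                size_t radius)
{
    assert(start < n);
    if (freecount[start])
        return (long)start;

    for (size_t d = 1; d <= radius; d++) {
        if (d <= start && freecount[start - d])
            return (long)(start - d);
        if (start + d < n && freecount[start + d])
            return (long)(start + d);
    }

    /* Not in range: stop searching outwards and scan from the front. */
    for (size_t i = 0; i < n; i++) {
        if (freecount[i])
            return (long)i;
    }
    return -1;
}
```

    The radius bounds the cost of the locality-preserving search; the
    fallback gives up locality in exchange for a predictable scan.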
     
  • Currently we have a xfs_inobt_lookup* variant for each comparison direction,
    and all these get all three fields of the inobt records passed, while the
    common case is just looking for the inode number and we have only marginally
    more callers than xfs_inobt_lookup* variants.

    So opencode a direct call to xfs_btree_lookup for the single case where we
    need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_lookup that
    just takes the inode number and the direction for all other callers.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Christoph Hellwig
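
    The idea of one lookup routine that takes the inode number and a
    direction, instead of one variant per direction, might look like this in
    user space. The enum, the function name and the sorted array of starting
    inode numbers are hypothetical and do not reproduce the kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Comparison direction for the lookup; names are illustrative. */
enum lookup_dir { LOOKUP_LE, LOOKUP_EQ, LOOKUP_GE };

/*
 * Single entry point replacing per-direction variants. startinos must
 * be sorted ascending. Returns the index of the matching record, or -1.
 */
static long
inobt_lookup(const uint32_t *startinos, size_t n, uint32_t ino,
             enum lookup_dir dir)
{
    size_t lo = 0, hi = n;

    assert(startinos != NULL || n == 0);

    /* Find the first index whose key is >= ino. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (startinos[mid] < ino)
            lo = mid + 1;
        else
            hi = mid;
    }

    switch (dir) {
    case LOOKUP_GE:
        return lo < n ? (long)lo : -1;
    case LOOKUP_EQ:
        return (lo < n && startinos[lo] == ino) ? (long)lo : -1;
    case LOOKUP_LE:
        if (lo < n && startinos[lo] == ino)
            return (long)lo;
        return lo > 0 ? (long)(lo - 1) : -1;
    }
    return -1;
}
```

    Callers that only care about the inode number pass it with a direction;
    the rarer caller that needs every record field would bypass this helper,
    as the commit message describes.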
     
  • Clarify the control flow in xfs_dialloc. Factor out a helper to go to the
    next node from the current one and improve the control flow by expanding
    composite if statements and using gotos.

    The xfs_ialloc_next_rec helper is borrowed from Dave Chinner's dynamic
    allocation policy patches.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Christoph Hellwig
     
  • Factor out a common helper from repeated debug checks in xfs_dialloc and
    xfs_difree.

    [hch: split out from Dave's dynamic allocation policy patches]

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Dave Chinner
     
  • Both callers of xfs_inobt_update have the record in form of a
    xfs_inobt_rec_incore_t, so just pass a pointer to it instead of the
    individual variables.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Christoph Hellwig
     
  • Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and
    those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the
    three individual variables, too. So just change xfs_inobt_get_rec to write
    the output into a xfs_inobt_rec_incore_t directly.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Christoph Hellwig
     
  • Factor out code to initialize new inode clusters into a function of its own.
    This keeps xfs_ialloc_ag_alloc smaller and better structured and enables a
    future inode cluster initialization transaction. Also initialize the agno
    variable earlier in xfs_ialloc_ag_alloc to avoid repeated byte swaps.

    [hch: The original patch is from Dave from his unpublished inode create
    transaction patch series, with some modifications by me to apply stand-alone]

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

    Dave Chinner
     

29 Mar, 2009

1 commit


09 Feb, 2009

1 commit

  • xfs_ialloc_btree.h has a couple of macros that only obfuscate the code
    but don't provide any abstraction benefits. This patch removes those
    and cleans up the remaining definitions a little.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

16 Jan, 2009

1 commit


01 Dec, 2008

1 commit