12 Oct, 2011

2 commits


27 Jul, 2011

2 commits


11 Nov, 2010

1 commit

  • The filestreams code may take the iolock on the parent inode while
    holding it on a child. This is the only place in XFS where we take
    both the child and parent iolock, so just telling lockdep about it
    is enough. The lock flag required for that was already added as
    part of the ilock lockdep annotations and unused so far.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

27 Jul, 2010

3 commits

  • Move xfs_filestream_peek_ag, xfs_filestream_get_ag and xfs_filestream_put_ag
    from xfs_filestream.h to xfs_filestream.c where their only callers are, and
    remove the inline markers while we're at it to let the compiler decide on the
    inlining. Also don't return a value from xfs_filestream_put_ag because
    we don't need it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Dmapi support was never merged upstream, but we still have a lot of hooks
    bloating XFS for it, all over the fast paths of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we ever get HSM
    support in mainline, at least the namespace events can be done much more
    sanely in the VFS instead of the individual filesystem, so it's not like
    this is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

16 Jan, 2010

3 commits

  • The filestreams cache flush is not needed in the sync code as it
    does not affect data writeback, and it is now not used by the growfs
    code, either, so kill it.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • The use of an array for the per-ag structures requires reallocation
    of the array when growing the filesystem. This requires locking
    access to the array to avoid use after free situations, and the
    locking is difficult to get right. To avoid needing to reallocate an
    array, change the per-ag structures to an allocated object per ag
    and index them using a tree structure.

    The AGs are always densely indexed (hence the use of an array), but
    the number supported is 2^32 and lookups tend to be random and hence
    indexing needs to scale. A simple choice is a radix tree - it works
    well with this sort of index. This change also removes another
    large contiguous allocation from the mount/growfs path in XFS.

    The growing process now needs to change to only initialise the new
    AGs required for the extra space, and as such only needs to
    exclusively lock the tree for inserts. The rest of the code only
    needs to lock the tree while doing lookups, and hence this will
    remove all the deadlocks that currently occur on the m_perag_lock as
    it is now an innermost lock. The lock is also changed to a spinlock
    from a read/write lock as the hold time is now extremely short.

    To complete the picture, the per-ag structures will need to be
    reference counted to ensure that we don't free/modify them while
    they are still in use. This will be done in a subsequent patch.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
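
The per-AG change described above can be illustrated in userspace. This is a minimal sketch under stated assumptions, not the kernel code: a fixed two-level table stands in for the radix tree, all locking is omitted, and `struct mount`, `mount_grow`, `perag_get` and `perag_put` are hypothetical names. It shows the two properties the message relies on: growing only inserts new objects without moving existing ones, and lookups hand out reference-counted pointers.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_BITS 6
#define CHUNK_SIZE (1u << CHUNK_BITS)
#define MAX_CHUNKS 64

/* One separately allocated object per allocation group. */
struct perag {
	unsigned int agno;
	int ref;		/* active references handed out by perag_get() */
};

/* Hypothetical mount structure: a two-level table instead of a radix tree. */
struct mount {
	struct perag **chunks[MAX_CHUNKS];
	unsigned int agcount;
};

/* Grow to new_agcount AGs. Only the new AGs are allocated and inserted;
 * existing perag objects are never moved or reallocated. */
int mount_grow(struct mount *mp, unsigned int new_agcount)
{
	unsigned int agno;

	if (new_agcount > MAX_CHUNKS * CHUNK_SIZE)
		return -1;
	for (agno = mp->agcount; agno < new_agcount; agno++) {
		unsigned int c = agno >> CHUNK_BITS;
		struct perag *pag;

		if (!mp->chunks[c]) {
			mp->chunks[c] = calloc(CHUNK_SIZE, sizeof(struct perag *));
			if (!mp->chunks[c])
				return -1;
		}
		pag = calloc(1, sizeof(*pag));
		if (!pag)
			return -1;
		pag->agno = agno;
		mp->chunks[c][agno & (CHUNK_SIZE - 1)] = pag;
		mp->agcount = agno + 1;
	}
	return 0;
}

/* Look up an AG and take a reference on it; NULL if out of range. */
struct perag *perag_get(struct mount *mp, unsigned int agno)
{
	struct perag *pag;

	if (agno >= mp->agcount)
		return NULL;
	pag = mp->chunks[agno >> CHUNK_BITS][agno & (CHUNK_SIZE - 1)];
	pag->ref++;
	return pag;
}

/* Drop a reference taken by perag_get(). */
void perag_put(struct perag *pag)
{
	assert(pag->ref > 0);
	pag->ref--;
}
```

In this sketch a pointer obtained before a grow remains valid afterwards, which is the use-after-free hazard the array reallocation had.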
     
  • Use xfs_perag_get() and xfs_perag_put() in the filestreams code.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     

15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. All this is described in more detail
    in Documentation/trace/events.txt. To read the events do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads hit various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not in mainline yet. I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

08 Jun, 2009

1 commit

  • xfs_sync_inodes is used to write back either file data or inode metadata.
    In general we always do these separately, except for one fishy case in
    xfs_fs_put_super that does both. So separate xfs_sync_inodes into
    separate xfs_sync_data and xfs_sync_attr functions. In xfs_fs_put_super
    we first call the data sync and then the attr sync as that was the previous
    order. The moved log force in that path doesn't make a difference because
    we will force the log again as part of the real unmount process.

    The filesystem readonly checks are not performed by the new functions but
    instead moved into the callers, given that most callers already have them
    further up in the stack. Also add debug checks that we do not pass in
    incorrect flags in the new xfs_sync_data and xfs_sync_attr functions and
    fix the one place that did pass in a wrong flag.

    Also remove a comment mentioning xfs_sync_inodes that has been incorrect
    for a while because we always take either the iolock or ilock in the
    sync path these days.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Eric Sandeen

    Christoph Hellwig
     

16 Mar, 2009

1 commit


13 Aug, 2008

1 commit

  • Use KM_NOFS to prevent recursion back into the filesystem which can cause
    deadlocks.

    In the case of xfs_iread() we hold the lock on the inode cluster buffer
    while allocating memory for the trace buffers. If we recurse back into XFS
    to flush data, that may require a transaction to allocate extents, which
    needs log space. This can deadlock with the xfsaild thread, which can't
    push the tail of the log because it is trying to get the inode cluster
    buffer lock.

    SGI-PV: 981498

    SGI-Modid: xfs-linux-melb:xfs-kern:31838a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: David Chinner

    Lachlan McIlroy
     

28 Jul, 2008

1 commit

  • Currently the xfs module init/exit code is a mess. It's farmed out over a
    lot of functions with very little error checking. This patch makes sure we
    propagate all initialization failures properly and clean up after them.
    Various runtime initializations are replaced with compile-time
    initializations where possible to make this easier. The exit path is
    similarly consolidated.

    There are now split-out functions to create/destroy the kmem zones and
    alloc/free the trace buffers. I've also changed the ktrace allocations to
    KM_MAYFAIL and handled errors resulting from that.

    And yes, we really should replace the XFS_*_TRACE ifdefs with a single
    XFS_TRACE..

    SGI-PV: 976035

    SGI-Modid: xfs-linux-melb:xfs-kern:31354a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Niv Sardi
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
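
The init/exit cleanup described above follows a common C pattern: each initialization step is checked, and a failure unwinds only the steps that already succeeded. The following is a minimal userspace sketch of that pattern, not the XFS code. The step names `ZONES`, `TRACE` and `MRU`, the failure-injection parameter, and all function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical subsystems, tracked as bits in a mask. */
#define ZONES 1
#define TRACE 2
#define MRU   4

static int fail_at;	/* which step should fail (0 = none), for the demo */
static int live;	/* bitmask of subsystems currently initialised */

static int init_step(int bit)
{
	if (fail_at == bit)
		return -1;	/* simulated allocation failure */
	live |= bit;
	return 0;
}

static void exit_step(int bit)
{
	live &= ~bit;
}

/* Initialise all subsystems; on failure, undo the completed steps in
 * reverse order and propagate the error to the caller. */
int module_init_demo(int fail)
{
	int error;

	fail_at = fail;
	live = 0;

	error = init_step(ZONES);
	if (error)
		goto out;
	error = init_step(TRACE);
	if (error)
		goto out_destroy_zones;
	error = init_step(MRU);
	if (error)
		goto out_free_trace;
	return 0;

out_free_trace:
	exit_step(TRACE);
out_destroy_zones:
	exit_step(ZONES);
out:
	return error;
}

/* Report which subsystems are still initialised. */
int live_mask(void)
{
	return live;
}
```

Whichever step fails, `live_mask()` reports nothing left initialised after `module_init_demo()` returns an error.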
     

18 Apr, 2008

1 commit


07 Feb, 2008

1 commit

  • These are mostly locking annotations, marking things static, casts where
    needed and declaring stuff in header files.

    SGI-PV: 971186
    SGI-Modid: xfs-linux-melb:xfs-kern:30002a

    Signed-off-by: David Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    David Chinner
     

20 Sep, 2007

1 commit

  • xfs_filestream_mount() sets up an mru cache with:

        err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count,
                (xfs_mru_cache_free_func_t)xfs_fstrm_free_func);

    but that cast is causing problems...

        typedef void (*xfs_mru_cache_free_func_t)(unsigned long, void*);

    but:

        void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t *item)

    so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume
    it's getting garbage for *item, which subsequently causes an explosion.
    With this change the filestreams xfsqa tests don't oops on my 32-bit box.

    SGI-PV: 967795
    SGI-Modid: xfs-linux-melb:xfs-kern:29510a

    Signed-off-by: Eric Sandeen
    Signed-off-by: David Chinner
    Signed-off-by: Tim Shimmin

    Eric Sandeen
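
The mismatch described above is a general C hazard: calling a function through a pointer of an incompatible type is undefined behaviour, and a cast only silences the compiler. One way to avoid it, sketched below in userspace, is to give the callback exactly the signature the typedef declares and convert the arguments inside it. This is an illustration, not the actual patch. `free_func_t`, `item_t`, `typed_free`, `free_adapter`, `cache_evict` and `demo_evict` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Callback type the cache expects: the key is an unsigned long. */
typedef void (*free_func_t)(unsigned long key, void *data);

/* Hypothetical item type standing in for the filestream item. */
typedef struct item {
	uint64_t ino;
	int freed;
} item_t;

/* Typed helper taking a 64-bit inode number. Casting this function to
 * free_func_t and calling it through that pointer would be undefined
 * behaviour wherever unsigned long is narrower than 64 bits. */
static void typed_free(uint64_t ino, item_t *item)
{
	item->ino = ino;
	item->freed = 1;
}

/* Adapter with exactly the callback signature; the conversions happen
 * here, on values, rather than on the function pointer. */
static void free_adapter(unsigned long key, void *data)
{
	typed_free((uint64_t)key, (item_t *)data);
}

/* Stand-in for the cache invoking its free callback. */
static void cache_evict(free_func_t fn, unsigned long key, void *data)
{
	fn(key, data);
}

/* Returns 1 if the callback received the key and item it was given. */
int demo_evict(unsigned long key)
{
	item_t item = { 0, 0 };

	cache_evict(free_adapter, key, &item);
	return item.freed == 1 && item.ino == (uint64_t)key;
}
```

Because no function-pointer cast appears anywhere, the compiler can check every call against its real prototype.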
     

17 Sep, 2007

1 commit

  • Instead of running the mru cache reaper all the time based on a timeout,
    we should only run it when the cache has active objects. This allows CPUs
    to sleep when there is no activity rather than be woken repeatedly just to
    check if there is anything to do.

    SGI-PV: 968554
    SGI-Modid: xfs-linux-melb:xfs-kern:29305a

    Signed-off-by: David Chinner
    Signed-off-by: Donald Douwsma
    Signed-off-by: Tim Shimmin

    David Chinner
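
The idea above, arming the reaper only while the cache holds objects, can be modelled without any kernel infrastructure. The sketch below is a simplified simulation under stated assumptions: timer expiry is an explicit function call, expiry counts are passed in by the caller, and `struct mru_cache`, `cache_insert` and `timer_tick` are hypothetical names.

```c
#include <assert.h>

struct mru_cache {
	int nitems;	/* objects currently cached */
	int reap_armed;	/* 1 while a reap is queued */
	int wakeups;	/* how many times the reaper actually ran */
};

/* Adding an object arms the reaper if it is not already queued. */
void cache_insert(struct mru_cache *c)
{
	c->nitems++;
	if (!c->reap_armed)
		c->reap_armed = 1;
}

/* Reap up to 'expired' objects, then re-arm only if any remain. */
static void reap(struct mru_cache *c, int expired)
{
	if (expired > c->nitems)
		expired = c->nitems;
	c->nitems -= expired;
	c->wakeups++;
	c->reap_armed = (c->nitems > 0);
}

/* Simulated timer expiry: nothing runs unless a reap was armed. */
void timer_tick(struct mru_cache *c, int expired)
{
	if (c->reap_armed)
		reap(c, expired);
}
```

In this model an empty cache produces no wakeups at all, however many timer periods pass.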
     

14 Jul, 2007

1 commit

  • In media spaces, video is often stored in a frame-per-file format. When
    dealing with uncompressed realtime HD video streams in this format, it is
    crucial that files do not get fragmented and that multiple files are placed
    contiguously on disk.

    When multiple streams are being ingested and played out at the same time,
    it is critical that the filesystem does not cross the streams and
    interleave them together as this creates seek and readahead cache miss
    latency and prevents both ingest and playout from meeting frame rate
    targets.

    This patch set introduces a "stream of files" concept into the allocator to
    place all the data from a single stream contiguously on disk so that RAID
    array readahead can be used effectively. Each additional stream gets
    placed in different allocation groups within the filesystem, thereby
    ensuring that we don't cross any streams. When an AG fills up, we select a
    new AG for the stream that is not in use.

    The core of the functionality is the stream tracking - each inode that we
    create in a directory needs to be associated with the directory's stream.
    Hence every time we create a file, we look up the directory's stream
    object and associate the new file with that object.

    Once we have a stream object for a file, we use the AG that the stream
    object points to for allocations. If we can't allocate in that AG (e.g. it
    is full) we move the entire stream to another AG. Other inodes in the same
    stream are moved to the new AG on their next allocation (i.e. lazy
    update).

    Stream objects are kept in a cache and hold a reference on the inode.
    Hence the inode cannot be reclaimed while there is an outstanding stream
    reference. This means that on unlink we need to remove the stream
    association and we also need to flush all the associations on certain
    events that want to reclaim all unreferenced inodes (e.g. filesystem
    freeze).

    SGI-PV: 964469
    SGI-Modid: xfs-linux-melb:xfs-kern:29096a

    Signed-off-by: David Chinner
    Signed-off-by: Barry Naujok
    Signed-off-by: Donald Douwsma
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Tim Shimmin
    Signed-off-by: Vlad Apostolov

    David Chinner
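
The stream tracking described above can be sketched as a small userspace model. It is an illustration under stated assumptions, not the XFS allocator: AGs are reduced to a free-block counter and a stream count, there is no cache or inode reference handling, and `struct ag`, `struct stream`, `struct file`, `file_create` and `file_alloc` are hypothetical names. It shows files inheriting the directory's stream, each stream getting its own AG, and a full AG moving the whole stream while other files follow lazily.

```c
#include <assert.h>

#define NAGS  4
#define NO_AG (-1)

struct ag {
	int free_blocks;
	int streams;		/* number of streams currently using this AG */
};

struct stream {
	int agno;		/* AG this stream currently allocates from */
};

struct file {
	struct stream *stream;	/* stream of the parent directory */
	int last_agno;		/* AG used by this file's last allocation */
};

/* Find an AG that no stream is using and that has enough room. */
static int pick_unused_ag(struct ag *ags, int need)
{
	for (int i = 0; i < NAGS; i++)
		if (ags[i].streams == 0 && ags[i].free_blocks >= need)
			return i;
	return NO_AG;
}

/* On create, associate the new file with its directory's stream. */
void file_create(struct file *f, struct stream *dir_stream)
{
	f->stream = dir_stream;
	f->last_agno = NO_AG;
}

/* Allocate from the stream's AG; if that AG cannot satisfy the request,
 * move the whole stream to an unused AG. Returns the AG used or NO_AG. */
int file_alloc(struct ag *ags, struct file *f, int blocks)
{
	struct stream *s = f->stream;

	if (s->agno == NO_AG || ags[s->agno].free_blocks < blocks) {
		int next = pick_unused_ag(ags, blocks);

		if (next == NO_AG)
			return NO_AG;
		if (s->agno != NO_AG)
			ags[s->agno].streams--;
		ags[next].streams++;
		s->agno = next;
	}
	ags[s->agno].free_blocks -= blocks;
	f->last_agno = s->agno;	/* other files catch up on their next call */
	return s->agno;
}
```

Because files store only a pointer to the shared stream object, moving the stream needs no walk over its files; each one picks up the new AG the next time it allocates.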