13 Jul, 2011

3 commits


08 Jul, 2011

1 commit

  • Rename xfs_buf_cond_lock and reverse it's return value to fit most other
    trylock operations in the Kernel and XFS (with the exception of down_trylock,
    after which xfs_buf_cond_lock was modelled), and replace xfs_buf_lock_val
    with an xfs_buf_islocked for use in asserts, or and opencoded variant in
    tracing. remove the XFS_BUF_* wrappers for all the locking helpers.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

26 Mar, 2011

1 commit

  • When inside a transaction and we fail to read a buffer,
    xfs_trans_read_buf returns a null buffer pointer and no error.
    xfs_do_da_buf() checks the error return, but not the buffer, and as
    a result this read failure condition causes a panic when it attempts
    to dereference the non-existant buffer.

    Make xfs_trans_read_buf() return the same error for this situation
    regardless of whether it is in a transaction or not. This means
    every caller does not need to check both the error return and the
    buffer before proceeding to use the buffer.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder

    Dave Chinner
     

07 Mar, 2011

1 commit


19 Oct, 2010

1 commit


27 Jul, 2010

4 commits

  • Stop the function pointer casting madness and give all the li_cb instances
    correct prototype.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Currently we track log item descriptor belonging to a transaction using a
    complex opencoded chunk allocator. This code has been there since day one
    and seems to work around the lack of an efficient slab allocator.

    This patch replaces it with dynamically allocated log item descriptors
    from a dedicated slab pool, linked to the transaction by a linked list.

    This allows to greatly simplify the log item descriptor tracking to the
    point where it's just a couple hundred lines in xfs_trans.c instead of
    a separate file. The external API has also been simplified while we're
    at it - the xfs_trans_add_item and xfs_trans_del_item functions to add/
    delete items from a transaction have been simplified to the bare minium,
    and the xfs_trans_find_item function is replaced with a direct dereference
    of the li_desc field. All debug code walking the list of log items in
    a transaction is down to a simple list_for_each_entry.

    Note that we could easily use a singly linked list here instead of the
    double linked list from list.h as the fastpath only does deletion from
    sequential traversal. But given that we don't have one available as
    a library function yet I use the list.h functions for simplicity.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Dmapi support was never merged upstream, but we still have a lot of hooks
    bloating XFS for it, all over the fast pathes of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we'll ever get HSM
    support in mainline at least the namespace events can be done much saner
    in the VFS instead of the individual filesystem, so it's not like this
    is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

24 May, 2010

2 commits

  • With delayed logging, we can get inode allocation buffers in the
    same transaction inode unlink buffers. We don't currently mark inode
    allocation buffers in the log, so inode unlink buffers take
    precedence over allocation buffers.

    The result is that when they are combined into the same checkpoint,
    only the unlinked inode chain fields are replayed, resulting in
    uninitialised inode buffers being detected when the next inode
    modification is replayed.

    To fix this, we need to ensure that we do not set the inode buffer
    flag in the buffer log item format flags if the inode allocation has
    not already hit the log. To avoid requiring a change to log
    recovery, we really need to make this a modification that relies
    only on in-memory sate.

    We can do this by checking during buffer log formatting (while the
    CIL cannot be flushed) if we are still in the same sequence when we
    commit the unlink transaction as the inode allocation transaction.
    If we are, then we do not add the inode buffer flag to the buffer
    log format item flags. This means the entire buffer will be
    replayed, not just the unlinked fields. We do this while
    CIL flusheѕ are locked out to ensure that we don't race with the
    sequence numbers changing and hence fail to put the inode buffer
    flag in the buffer format flags when we really need to.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • Clean up the buffer log format (XFS_BLI_*) flags because they have a
    polluted namespace. They XFS_BLI_ prefix is used for both in-memory
    and on-disk flag feilds, but have overlapping values for different
    flags. Rename the buffer log format flags to use the XFS_BLF_*
    prefix to avoid confusing them with the in-memory XFS_BLI_* prefixed
    flags.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     

19 May, 2010

2 commits

  • We currenly have a routine xfs_trans_buf_item_match_all which checks
    if any log item in a transaction contains a given buffer, and a
    second one that only does this check for the first, embedded chunk
    of log items. We only use the second routine if we know we only
    have that log item chunk, so get rid of the limited routine and
    always use the more complete one.

    Also rename the old xfs_trans_buf_item_match_all to
    xfs_trans_buf_item_match and update various surrounding comments,
    and move the remaining xfs_trans_buf_item_match on top of the file
    to avoid a forward prototype.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • The staleness of a object being unpinned can be directly derived
    from the object itself - there is no need to extract it from the
    object then pass it as a parameter into IOP_UNPIN().

    This means we can kill the XFS_LID_BUF_STALE flag - it is set,
    checked and cleared in the same places XFS_BLI_STALE flag in the
    xfs_buf_log_item so it is now redundant and hence safe to remove.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

02 Mar, 2010

1 commit


22 Jan, 2010

1 commit

  • Currently we define aliases for the buffer flags in various
    namespaces, which only adds confusion. Remove all but the XBF_
    flags to clean this up a bit.

    Note that we still abuse XFS_B_ASYNC/XBF_ASYNC for some non-buffer
    uses, but I'll clean that up later.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. In Documentation/trace/events.txt
    all this is desctribed in more detail. To reads the events do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads git various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not in mainline yet, I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

12 Dec, 2009

1 commit

  • Currently the low-level buffer cache interfaces are highly confusing
    as we have a _flags variant of each that does actually respect the
    flags, and one without _flags which has a flags argument that gets
    ignored and overriden with a default set. Given that very few places
    use the default arguments get rid of the duplication and convert all
    callers to pass the flags explicitly. Also remove the now confusing
    _flags postfix.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

01 Sep, 2009

1 commit

  • bp was tested for NULL a few lines before, followed by a return, and there
    is no intervening modification of its value.

    A simplified version of the semantic match that finds this problem is as
    follows: (http://www.emn.fr/x-info/coccinelle/)

    //
    @r exists@
    local idexpression x;
    expression E;
    position p1,p2;
    @@

    if (x == NULL || ...) { ... when forall
    return ...; }
    ... when != \(x=E\|x--\|x++\|--x\|++x\|x-=E\|x+=E\|x|=E\|x&=E\|&x\)
    (
    *x == NULL
    |
    *x != NULL
    )
    //

    Signed-off-by: Julia Lawall
    Acked-by: Felix Blyakher
    Signed-off-by: Felix Blyakher

    Julia Lawall
     

30 Oct, 2008

1 commit

  • Change all the remaining AIL API functions that are passed struct
    xfs_mount pointers to pass pointers directly to the struct xfs_ail being
    used. With this conversion, all external access to the AIL is via the
    struct xfs_ail. Hence the operation and referencing of the AIL is almost
    entirely independent of the xfs_mount that is using it - it is now much
    more tightly tied to the log and the items it is tracking in the log than
    it is tied to the xfs_mount.

    SGI-PV: 988143

    SGI-Modid: xfs-linux-melb:xfs-kern:32353a

    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Christoph Hellwig

    David Chinner
     

13 Aug, 2008

1 commit


18 Apr, 2008

2 commits

  • xfsbdstrat() is declared to return an error. That is never checked because
    the error is propagated by the xfs_buf_t that is passed through the
    function.

    Mark xfsbdstrat() as returning void and comment the prototype on the
    methods needed for error checking.

    SGI-PV: 980084
    SGI-Modid: xfs-linux-melb:xfs-kern:30823a

    Signed-off-by: David Chinner
    Signed-off-by: Niv Sardi
    Signed-off-by: Lachlan McIlroy

    David Chinner
     
  • When pdflush is writing back inodes, it can get stuck on inode cluster
    buffers that are currently under I/O. This occurs when we write data to
    multiple inodes in the same inode cluster at the same time.

    Effectively, delayed allocation marks the inode dirty during the data
    writeback. Hence if the inode cluster was flushed during the writeback of
    the first inode, the writeback of the second inode will block waiting for
    the inode cluster write to complete before writing it again for the newly
    dirtied inode.

    Basically, we want to avoid this from happening so we don't block pdflush
    and slow down all of writeback. Hence we introduce a non-blocking async
    inode flush flag that pdflush uses. If this flag is set, we use
    non-blocking operations (e.g. try locks) whereever we can to avoid
    blocking or extra I/O being issued.

    SGI-PV: 970925
    SGI-Modid: xfs-linux-melb:xfs-kern:30501a

    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy

    David Chinner
     

01 Oct, 2007

1 commit

  • This reverts commit b394e43e995d08821588a22561c6a71a63b4ff27.

    Lachlan McIlroy says:
    It tried to fix an issue where log replay is replaying an inode cluster
    initialisation transaction that should not be replayed because the inode
    cluster on disk is more up to date. Since we don't log file sizes (we
    rely on inode flushing to get them to disk) then we can't just replay
    all the transations in the log and expect the inode to be completely
    restored. We lose file size updates. Unfortunately this fix is causing
    more (serious) problems than it is fixing.

    SGI-PV: 969656
    SGI-Modid: xfs-linux-melb:xfs-kern:29804a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Tim Shimmin

    Tim Shimmin
     

18 Sep, 2007

1 commit


20 Jun, 2006

1 commit


09 Jun, 2006

2 commits


02 Nov, 2005

2 commits


05 Sep, 2005

1 commit


21 Jun, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds