12 Oct, 2011

1 commit

  • There is no reason to keep a reference to the inode even if we unlock
    it during transaction commit because we never drop a reference between
    the ijoin and commit. Also use this fact to merge xfs_trans_ijoin_ref
    back into xfs_trans_ijoin - the third argument decides if an unlock
    is needed now.

    I'm actually starting to wonder if allowing inodes to be unlocked
    at transaction commit really is worth the effort. The only real
    benefit is that they can be unlocked earlier when commiting a
    synchronous transactions, but that could be solved by doing the
    log force manually after the unlock, too.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

27 Jul, 2011

1 commit


10 Dec, 2010

1 commit

  • Now that we don't mark VFS inodes dirty anymore for internal
    timestamp changes, but rely on the transaction subsystem to push
    them out, we need to explicitly log the source inode in rename after
    updating it's timestamps to make sure the changes actually get
    forced out by sync/fsync or an AIL push.

    We already account for the fourth inode in the log reservation, as a
    rename of directories needs to update the nlink field, so just
    adding the xfs_trans_log_inode call is enough.

    This fixes the xfsqa 065 regression introduced by:

    "xfs: don't use vfs writeback for pure metadata modifications"

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

19 Oct, 2010

2 commits

  • This patch adds support for 32bit project quota identifiers.

    On disk format is backward compatible with 16bit projid numbers. projid
    on disk is now kept in two 16bit values - di_projid_lo (which holds the
    same position as old 16bit projid value) and new di_projid_hi (takes
    existing padding) and converts from/to 32bit value on the fly.

    xfs_admin (for existing fs), mkfs.xfs (for new fs) needs to be used
    to enable PROJID32BIT support.

    Signed-off-by: Arkadiusz Miśkiewicz
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Arkadiusz Mi?kiewicz
     
  • Under heavy multi-way parallel create workloads, the VFS struggles
    to write back all the inodes that have been changed in age order.
    The bdi flusher thread becomes CPU bound, spending 85% of it's time
    in the VFS code, mostly traversing the superblock dirty inode list
    to separate dirty inodes old enough to flush.

    We already keep an index of all metadata changes in age order - in
    the AIL - and continued log pressure will do age ordered writeback
    without any extra overhead at all. If there is no pressure on the
    log, the xfssyncd will periodically write back metadata in ascending
    disk address offset order so will be very efficient.

    Hence we can stop marking VFS inodes dirty during transaction commit
    or when changing timestamps during transactions. This will keep the
    inodes in the superblock dirty list to those containing data or
    unlogged metadata changes.

    However, the timstamp changes are slightly more complex than this -
    there are a couple of places that do unlogged updates of the
    timestamps, and the VFS need to be informed of these. Hence add a
    new function xfs_trans_ichgtime() for transactional changes,
    and leave xfs_ichgtime() for the non-transactional changes.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

27 Jul, 2010

4 commits

  • Replace the xfs_itrace_entry catchall with specific trace points. For
    most simple callers we now use the simple inode class, which used to
    be the iget class, but add more details tracing for namespace events,
    which now includes the name of the directory entries manipulated.

    Remove the xfs_inactive trace point, which is a duplicate of the clear_inode
    one, and the xfs_change_file_space trace point, which is immediately
    followed by the more specific alloc/free space trace points.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Currently we need to either call IHOLD or xfs_trans_ihold on an inode when
    joining it to a transaction via xfs_trans_ijoin.

    This patches instead makes xfs_trans_ijoin usable on it's own by doing
    an implicity xfs_trans_ihold, which also allows us to drop the third
    argument. For the case where we want to hold a reference on the inode
    a xfs_trans_ijoin_ref wrapper is added which does the IHOLD and marks
    the inode for needing an xfs_iput. In addition to the cleaner interface
    to the caller this also simplifies the implementation.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Dmapi support was never merged upstream, but we still have a lot of hooks
    bloating XFS for it, all over the fast pathes of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we'll ever get HSM
    support in mainline at least the namespace events can be done much saner
    in the VFS instead of the individual filesystem, so it's not like this
    is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. In Documentation/trace/events.txt
    all this is desctribed in more detail. To reads the events do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us to e.g. count how often certain workloads git various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features not in mainline yet, I plan to
    deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

08 Jun, 2009

1 commit

  • Kill the quota ops function vector and replace it with direct calls or
    stubs in the CONFIG_XFS_QUOTA=n case.

    Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can remove
    the number of those checks because the XFS_TRANS_DQ_DIRTY flag can't be set
    otherwise.

    This brings us back closer to the way this code worked in IRIX and earlier
    Linux versions, but we keep a lot of the more useful factoring of common
    code.

    Eventually we should also kill xfs_qm_bhv.c, but that's left for a later
    patch.

    Reduces the size of the source code by about 250 lines and the size of
    XFS module by about 1.5 kilobytes with quotas enabled:

    text data bss dec hex filename
    615957 2960 3848 622765 980ad fs/xfs/xfs.o
    617231 3152 3848 624231 98667 fs/xfs/xfs.o.old

    Fallout:

    - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects
    the inode locked and xfs_qm_dqattach which does the locking around it,
    thus removing XFS_QMOPT_ILOCKED.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Eric Sandeen

    Christoph Hellwig
     

16 Jan, 2009

1 commit


11 Dec, 2008

1 commit

  • Check for the project ID after attaching all inodes to the transaction.
    That way the unlock in the error case is done by the transaction subsystem,
    which guaratees that is uses the right flags (which was wrong from day one
    of this check), and avoids having special code unlocking an array of inodes
    with potential duplicates. Attaching the inode first is the method used
    by xfs_rename and the other namespace methods all other error that require
    multiple locked inodes.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
     

05 Dec, 2008

1 commit

  • When project quota is active and is being used for directory tree
    quota control, we disallow rename outside the current directory
    tree. This requires a check to be made after all the inodes
    involved in the rename are locked. We fail to unlock the inodes
    correctly if we disallow the rename when the target is outside the
    current directory tree. This results in a hang on the next access
    to the inodes involved in failed rename.

    Reported-by: Arkadiusz Miskiewicz
    Signed-off-by: Dave Chinner
    Tested-by: Arkadiusz Miskiewicz
    Signed-off-by: Lachlan McIlroy

    Dave Chinner
     

01 Dec, 2008

1 commit

  • i_gen is incremented in directory operations when the
    directory is changed. It is never read or otherwise used
    so it should be removed to help reduce the size of the
    struct xfs_inode.

    The patch also removes a duplicate logging of the directory
    inode core. We only need to do this once per transaction
    so kill the one associated with the i_gen increment.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Niv Sardi

    Dave Chinner
     

28 Jul, 2008

1 commit

  • As reported by Michael-John Turner XFS updates the mtime on the source
    inode of a rename call in case it's a directory and changes the parent.

    This doesn't make any sense, is not mentioned in the standards and not
    performed by any other Linux filesystems so remove it.

    SGI-PV: 983684

    SGI-Modid: xfs-linux-melb:xfs-kern:31364a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Barry Naujok
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
     

29 Apr, 2008

2 commits

  • Similar to to the previous patch for remove and rmdir only grab a
    reference to inodes when we join them to transaction to balance the
    decrement on transaction completion. Everything else it taken care of by
    the VFS.

    Note that the old case had leaks of inode count when src == target or src
    or target == one of the parent inodes, but these cases are fortunately
    already rejected by the VFS.

    SGI-PV: 976035
    SGI-Modid: xfs-linux-melb:xfs-kern:30904a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
     
  • ->rename already gets the target inode passed if it exits. Pass it down to
    xfs_rename so that we can avoid looking it up again. Also simplify locking
    as the first lock section in xfs_rename can go away now: the isdir is an
    invariant over the lifetime of the inode, and new_parent and the nlink
    check are namespace topology protected by i_mutex in the VFS. The projid
    check needs to move into the second lock section anyway to not be racy.

    Also kill the now unused xfs_dir_lookup_int and remove the now-unused
    first_locked argumet to xfs_lock_inodes.

    SGI-PV: 976035
    SGI-Modid: xfs-linux-melb:xfs-kern:30903a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
     

18 Apr, 2008

5 commits


07 Feb, 2008

2 commits

  • These are mostly locking annotations, marking things static, casts where
    needed and declaring stuff in header files.

    SGI-PV: 971186
    SGI-Modid: xfs-linux-melb:xfs-kern:30002a

    Signed-off-by: David Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    David Chinner
     
  • Simplify vnode tracing calls by embedding function name & return addr in
    the calling macro.

    Also do a lot of vnode->inode renaming for consistency, while we're at it.

    SGI-PV: 970335
    SGI-Modid: xfs-linux-melb:xfs-kern:29650a

    Signed-off-by: Eric Sandeen
    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Tim Shimmin

    Lachlan McIlroy
     

16 Oct, 2007

2 commits


15 Oct, 2007

3 commits

  • All vnode ops now take struct xfs_inode pointers and the behaviour related
    glue is split out into methods of it's own. This required fixing
    xfs_create/mkdir/symlink to not mess with the inode pointer but rather use
    a separate boolean for error handling. Thanks to Dave Chinner for that
    fix.

    SGI-PV: 969608
    SGI-Modid: xfs-linux-melb:xfs-kern:29492a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David Chinner
    Signed-off-by: Tim Shimmin

    Christoph Hellwig
     
  • One of the perpetual scaling problems XFS has is indexing it's incore
    inodes. We currently uses hashes and the default hash sizes chosen can
    only ever be a tradeoff between memory consumption and the maximum
    realistic size of the cache.

    As a result, anyone who has millions of inodes cached on a filesystem
    needs to tunes the size of the cache via the ihashsize mount option to
    allow decent scalability with inode cache operations.

    A further problem is the separate inode cluster hash, whose size is based
    on the ihashsize but is smaller, and so under certain conditions (sparse
    cluster cache population) this can become a limitation long before the
    inode hash is causing issues.

    The following patchset removes the inode hash and cluster hash and
    replaces them with radix trees to avoid the scalability limitations of the
    hashes. It also reduces the size of the inodes by 3 pointers....

    SGI-PV: 969561
    SGI-Modid: xfs-linux-melb:xfs-kern:29481a

    Signed-off-by: David Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Tim Shimmin

    David Chinner
     
  • SGI-PV: 968690
    SGI-Modid: xfs-linux-melb:xfs-kern:29340a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Vlad Apostolov
    Signed-off-by: Tim Shimmin

    Christoph Hellwig
     

08 May, 2007

1 commit


10 Feb, 2007

1 commit

  • The firstblock argument to xfs_bmap_finish is not used by that function.
    Remove it and cleanup the code a bit.

    Patch provided by Eric Sandeen.

    SGI-PV: 960196
    SGI-Modid: xfs-linux-melb:xfs-kern:28034a

    Signed-off-by: Eric Sandeen
    Signed-off-by: David Chinner
    Signed-off-by: Tim Shimmin

    Eric Sandeen
     

20 Jun, 2006

1 commit


09 Jun, 2006

2 commits


08 May, 2006

1 commit


11 Jan, 2006

1 commit


02 Nov, 2005

3 commits