14 Mar, 2012

1 commit

  • Timestamps on regular files are the last metadata that XFS does not update
    transactionally. Now that we use the delaylog mode exclusively and made
    the log scode scale extremly well there is no need to bypass that code for
    timestamp updates. Logging all updates allows to drop a lot of code, and
    will allow for further performance improvements later on.

    Note that this patch drops optimized handling of fdatasync - it will be
    added back in a separate commit.

    Reviewed-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

26 Feb, 2012

1 commit

  • At the end of xfs_reclaim_inode(), the inode is locked in order to
    we wait for a possible concurrent lookup to complete before the
    inode is freed. This synchronization step was taking both the ILOCK
    and the IOLOCK, but the latter was causing lockdep to produce
    reports of the possibility of deadlock.

    It turns out that there's no need to acquire the IOLOCK at this
    point anyway. It may have been required in some earlier version of
    the code, but there should be no need to take the IOLOCK in
    xfs_iget(), so there's no (longer) any need to get it here for
    synchronization. Add an assertion in xfs_iget() as a reminder
    of this assumption.

    Dave Chinner diagnosed this on IRC, and Christoph Hellwig suggested
    no longer including the IOLOCK. I just put together the patch.

    Signed-off-by: Alex Elder
    Reviewed-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Ben Myers

    Alex Elder
     

18 Jan, 2012

1 commit

  • We almost never block on i_flock, the exception is synchronous inode
    flushing. Instead of bloating the inode with a 16/24-byte completion
    that we abuse as a semaphore just implement it as a bitlock that uses
    a bit waitqueue for the rare sleeping path. This primarily is a
    tradeoff between a much smaller inode and a faster non-blocking
    path vs faster wakeups, and we are much better off with the former.

    A small downside is that we will lose lockdep checking for i_flock, but
    given that it's always taken inside the ilock that should be acceptable.

    Note that for example the inode writeback locking is implemented in a
    very similar way.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

09 Jan, 2012

1 commit

  • * 'for-linus' of git://oss.sgi.com/xfs/xfs: (22 commits)
    xfs: mark the xfssyncd workqueue as non-reentrant
    xfs: simplify xfs_qm_detach_gdquots
    xfs: fix acl count validation in xfs_acl_from_disk()
    xfs: remove unused XBT_FORCE_SLEEP bit
    xfs: remove XFS_QMOPT_DQSUSER
    xfs: kill xfs_qm_idtodq
    xfs: merge xfs_qm_dqinit_core into the only caller
    xfs: add a xfs_dqhold helper
    xfs: simplify xfs_qm_dqattach_grouphint
    xfs: nest qm_dqfrlist_lock inside the dquot qlock
    xfs: flatten the dquot lock ordering
    xfs: implement lazy removal for the dquot freelist
    xfs: remove XFS_DQ_INACTIVE
    xfs: cleanup xfs_qm_dqlookup
    xfs: cleanup dquot locking helpers
    xfs: remove the sync_mode argument to xfs_qm_dqflush_all
    xfs: remove xfs_qm_sync
    xfs: make sure to really flush all dquots in xfs_qm_quotacheck
    xfs: untangle SYNC_WAIT and SYNC_TRYLOCK meanings for xfs_qm_dqflush
    xfs: remove the lid_size field in struct log_item_desc
    ...

    Fix up trivial conflict in fs/xfs/xfs_sync.c

    Linus Torvalds
     

24 Dec, 2011

1 commit

  • Since Linux 2.6.36 the writeback code has introduces various measures for
    live lock prevention during sync(). Unfortunately some of these are
    actively harmful for the XFS model, where the inode gets marked dirty for
    metadata from the data I/O handler.

    The older_than_this checks that are now more strictly enforced since

    writeback: avoid livelocking WB_SYNC_ALL writeback

    by only calling into __writeback_inodes_sb and thus only sampling the
    current cut off time once. But on a slow enough devices the previous
    asynchronous sync pass might not have fully completed yet, and thus XFS
    might mark metadata dirty only after that sampling of the cut off time for
    the blocking pass already happened. I have not myself reproduced this
    myself on a real system, but by introducing artificial delay into the
    XFS I/O completion workqueues it can be reproduced easily.

    Fix this by iterating over all XFS inodes in ->sync_fs and log all that
    are dirty. This might log inode that only got redirtied after the
    previous pass, but given how cheap delayed logging of inodes is it
    isn't a major concern for performance.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Tested-by: Mark Tinguely
    Reviewed-by: Mark Tinguely
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

13 Dec, 2011

1 commit

  • Now that we can't have any dirty dquots around that aren't in the AIL we
    can get rid of the explicit dquot syncing from xfssyncd and xfs_fs_sync_fs
    and instead rely on AIL pushing to write out any quota updates.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

30 Nov, 2011

1 commit

  • If we are doing synchronous inode reclaim we block the VM from making
    progress in memory reclaim. So if we encouter a flush locked inode
    promote it in the delwri list and wake up xfsbufd to write it out now.
    Without this we can get hangs of up to 30 seconds during workloads hitting
    synchronous inode reclaim.

    The scheme is copied from what we do for dquot reclaims.

    Reported-by: Simon Kirby
    Signed-off-by: Christoph Hellwig
    Tested-by: Simon Kirby
    Signed-off-by: Ben Myers

    Christoph Hellwig
     

12 Oct, 2011

3 commits

  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • We now have an i_dio_count filed and surrounding infrastructure to wait
    for direct I/O completion instead of i_icount, and we have never needed
    to iocount waits for buffered I/O given that we only set the page uptodate
    after finishing all required work. Thus remove i_iocount, and replace
    the actually needed waits with calls to inode_dio_wait.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Remove the xfs_buf_relse from xfs_bwrite and let the caller handle it to
    mirror the delwri and read paths.

    Also remove the mount pointer passed to xfs_bwrite, which is superflous now
    that we have a mount pointer in the buftarg.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

13 Aug, 2011

1 commit

  • Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
    annoying subdirectories in the XFS source code. Besides the large
    amount of file rename the only changes are to the Makefile, a few
    files including headers with the subdirectory prefix, and the binary
    sysctl compat code that includes a header under fs/xfs/ from
    kernel/.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig