20 Jul, 2016

2 commits

  • Dave Chinner
     
  • The upcoming buftarg I/O accounting mechanism maintains a count of
    all buffers that have undergone I/O in the current hold-release
    cycle. Certain buffers associated with core infrastructure (e.g.,
    the xfs_mount superblock buffer, log buffers) are never released,
    however. This means that accounting I/O submission on such buffers
    elevates the buftarg count indefinitely and could lead to lockup on
    unmount.

    Define a new buffer flag to explicitly exclude buffers from buftarg
    I/O accounting. Set the flag on the superblock and associated log
    buffers.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
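
    The accounting rule described above can be modeled in user space. This is a minimal sketch, not the kernel implementation: the flag names BUF_NO_IOACCT and BUF_IN_FLIGHT, the structures, and the helper functions are all hypothetical, since the commit message does not name the new flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names; the commit message does not name the real flag. */
#define BUF_NO_IOACCT (1u << 0) /* exclude this buffer from I/O accounting */
#define BUF_IN_FLIGHT (1u << 1) /* already counted in this hold-release cycle */

struct buftarg { long io_count; };
struct buf { unsigned int flags; struct buftarg *target; };

/* Count the buffer on I/O submission unless it is excluded or already counted. */
void buf_ioacct_inc(struct buf *bp)
{
	assert(bp && bp->target);
	if (bp->flags & (BUF_NO_IOACCT | BUF_IN_FLIGHT))
		return;
	bp->flags |= BUF_IN_FLIGHT;
	bp->target->io_count++;
}

/* Drop the count when the buffer is released. */
void buf_ioacct_dec(struct buf *bp)
{
	assert(bp && bp->target);
	if (!(bp->flags & BUF_IN_FLIGHT))
		return;
	bp->flags &= ~BUF_IN_FLIGHT;
	bp->target->io_count--;
}

/* Unmount can only make progress once no accounted I/O remains. */
bool buftarg_idle(const struct buftarg *bt)
{
	return bt->io_count == 0;
}
```

    A buffer that is never released and is not excluded would keep io_count above zero forever, which is the unmount lockup the commit avoids.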
     

01 Jun, 2016

1 commit

  • Al Viro noticed that xfs_lock_inodes should be static, and
    that led to ... a few more.

    These are just the easy ones, others require moving functions
    higher in source files, so that's not done here to keep
    this review simple.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Eric Sandeen
     

20 May, 2016

1 commit


06 Apr, 2016

2 commits


07 Mar, 2016

1 commit


10 Feb, 2016

4 commits


05 Jan, 2016

2 commits

  • Dave Chinner
     
  • XFS now uses CRC verification over a limited section of the log to
    detect torn writes prior to a crash. This is difficult to test directly
    due to the timing and hardware requirements to cause a short write.

    Add a mechanism to inject CRC errors into log records to facilitate
    testing torn write detection during log recovery. This mechanism is
    dangerous and can result in filesystem corruption. Thus, it is only
    available in DEBUG mode for testing/development purposes. Set a non-zero
    value to the following sysfs entry to enable error injection:

    /sys/fs/xfs//log/log_badcrc_factor

    Once enabled, XFS intentionally writes an invalid CRC to a log record at
    some random point in the future based on the provided frequency. The
    filesystem immediately shuts down once the record has been written to
    the physical log to prevent metadata writeback (e.g., AIL insertion)
    once the log write completes. This helps reasonably simulate a torn
    write to the log as the affected record must be safe to discard. The
    next mount after the intentional shutdown requires log recovery and
    should detect and recover from the torn write.

    Note again that this _will_ result in data loss or worse. For testing
    and development purposes only!

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
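
    One way to read "based on the provided frequency" is a one-in-N decision per log record write. The sketch below assumes exactly that; the function names are hypothetical, the random value is passed in by the caller so the logic stays deterministic, and inverting the CRC is just one arbitrary way to make it invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether to corrupt this record. A factor of 0 disables injection;
 * a factor of N corrupts roughly one record in N (assumed interpretation).
 */
bool should_inject_badcrc(uint32_t factor, uint32_t random_value)
{
	return factor != 0 && (random_value % factor) == 0;
}

/* Produce a CRC guaranteed to differ from the correct one. */
uint32_t corrupt_crc(uint32_t crc)
{
	return ~crc;
}
```

    In this model the caller would shut the filesystem down right after writing the corrupted record, matching the sequence described in the commit message.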
     

04 Jan, 2016

1 commit

  • Update the log ticket reservation type printing code to reflect
    all the types of log tickets, to avoid incorrect debug output and
    avoid running off the end of the array.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Darrick J. Wong
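
    The overrun this commit guards against comes from indexing a name table with a type value the table does not cover. A minimal sketch of a bounds-checked lookup follows; the table contents, the 1-based numbering, and the fallback string are illustrative assumptions, not the kernel's actual table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of names; the real table is much longer. */
static const char *const res_type_str[] = {
	"SETATTR_NOT_SIZE",
	"SETATTR_SIZE",
	"INACTIVE",
};
#define NR_RES_TYPES (sizeof(res_type_str) / sizeof(res_type_str[0]))

/* Map a 1-based reservation type to a name without indexing out of range. */
const char *res_type_name(unsigned int type)
{
	if (type == 0 || type > NR_RES_TYPES)
		return "bad-type"; /* hypothetical fallback string */
	return res_type_str[type - 1];
}
```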
     

12 Oct, 2015

4 commits

  • Dave Chinner
     
  • This patch modifies the stats counting macros and the callers
    to those macros to properly increment, decrement, and add-to
    the xfs stats counts. The counts for global and per-fs stats
    are correctly advanced, and cleared by writing a "1" to the
    corresponding clear file.

    global counts: /sys/fs/xfs/stats/stats
    per-fs counts: /sys/fs/xfs/sda*/stats/stats

    global clear: /sys/fs/xfs/stats/stats_clear
    per-fs clear: /sys/fs/xfs/sda*/stats/stats_clear

    [dchinner: cleaned up macro variables, removed CONFIG_FS_PROC around
    stats structures and macros. ]

    Signed-off-by: Bill O'Donnell
    Reviewed-by: Eric Sandeen
    Signed-off-by: Dave Chinner

    Bill O'Donnell
     
  • The gcc undefined behavior sanitizer caught this; surely
    any sane memcpy implementation will no-op if size == 0,
    but behavior with a *src of NULL is technically undefined
    (declared nonnull), so avoid it here.

    We are actually in this situation frequently via
    xlog_commit_record(), because:

    struct xfs_log_iovec reg = {
            .i_addr = NULL,
            .i_len = 0,
            .i_type = XLOG_REG_TYPE_COMMIT,
    };

    Reported-by: Eric Sandeen
    Signed-off-by: Dave Chinner
    Reviewed-by: Eric Sandeen
    Signed-off-by: Dave Chinner

    Eric Sandeen
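
    The fix amounts to skipping memcpy() when a region is empty. A minimal sketch of that guard, using a hypothetical helper name copy_region() rather than the kernel's own function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes into dst and return the advanced destination pointer.
 * Empty regions are skipped so a NULL src is never handed to memcpy().
 */
char *copy_region(char *dst, const void *src, size_t len)
{
	assert(dst != NULL);
	if (len == 0)
		return dst; /* src may legitimately be NULL here */
	assert(src != NULL);
	memcpy(dst, src, len);
	return dst + len;
}
```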
     
  • Since the onset of v5 superblocks, the LSN of the last modification has
    been included in a variety of on-disk data structures. This LSN is used
    to provide log recovery ordering guarantees (e.g., to ensure an older
    log recovery item is not replayed over a newer target data structure).

    While this works correctly from the point a filesystem is formatted and
    mounted, userspace tools have some problematic behaviors that defeat
    this mechanism. For example, xfs_repair historically zeroes out the log
    unconditionally (regardless of whether corruption is detected). If this
    occurs, the LSN of the filesystem is reset and the log is now in a
    problematic state with respect to on-disk metadata structures that might
    have a larger LSN. Until either the log catches up to the highest
    previously used metadata LSN or each affected data structure is modified
    and written out without incident (which resets the metadata LSN), log
    recovery is susceptible to filesystem corruption.

    This problem is ultimately addressed and repaired in the associated
    userspace tools. The kernel is still responsible to detect the problem
    and notify the user that something is wrong. Check the superblock LSN at
    mount time and fail the mount if it is invalid. From that point on,
    trigger verifier failure on any metadata I/O where an invalid LSN is
    detected. This results in a filesystem shutdown and guarantees that we
    do not log metadata changes with invalid LSNs on disk. Since this is a
    known issue with a known recovery path, present a warning to instruct
    the user how to recover.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
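
    The validity check can be sketched as a comparison between a metadata LSN and the current position of the log. The model below assumes an LSN packs a cycle number in the upper 32 bits and a block number in the lower 32 bits; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Pack a cycle and block number into one LSN (assumed layout). */
lsn_t make_lsn(uint32_t cycle, uint32_t block)
{
	return ((lsn_t)cycle << 32) | block;
}

/* A metadata LSN is valid only if it does not lie ahead of the log. */
bool lsn_is_valid(lsn_t meta, uint32_t cur_cycle, uint32_t cur_block)
{
	uint32_t cycle = (uint32_t)(meta >> 32);
	uint32_t block = (uint32_t)meta;

	if (cycle > cur_cycle)
		return false;
	if (cycle == cur_cycle && block > cur_block)
		return false;
	return true;
}
```

    After a log has been zeroed, the current cycle restarts at a low value, so previously written metadata with a larger cycle fails this check.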
     

19 Aug, 2015

3 commits

  • Dave Chinner
     
  • The first 4 bytes of every basic block in the physical log are stamped
    with the current lsn. To support this mechanism, the log record header
    (first block of each new log record) contains space for the original
    first 4 bytes of each log record block before they are replaced with the lsn.
    The log record header has space for 32k worth of blocks. The version 2
    log adds new extended record headers for each additional 32k worth of
    blocks beyond what is supported by the record header.

    The log record checksum incorporates the log record header, the extended
    headers and the record payload. xlog_cksum() checksums the extended
    headers based on log->l_iclog_heads, which specifies the number of
    extended headers in a log record based on the log buffer size mount
    option. The log buffer size is variable, however, and thus means the
    checksum can be calculated differently based on how a filesystem is
    mounted. This is problematic if a filesystem crashes and recovery occurs
    on a subsequent mount using a different log buffer size. For example,
    crash an active filesystem that is mounted with the default (32k)
    logbsize, attempt remount/recovery using '-o logbsize=64k' and the mount
    fails on or warns about log checksum failures.

    To avoid this problem, update xlog_cksum() to calculate the checksum
    based on the size of the log buffer according to the log record. The
    size is already included in the h_size field of the log record header
    and thus is available at log recovery time. Extended log record headers
    are also only written when the log record is large enough to require
    them. This makes checksum calculation of log records consistent with the
    extended record header mechanism as well as how on-disk records are
    checksummed with various log buffer size mount options.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
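
    The header count arithmetic can be illustrated as follows. This is a sketch under the assumption that one header covers 32k of record data and each further 32k (or part of it) needs one extended header; the constant and function names are hypothetical.

```c
#include <assert.h>

#define HEADER_CYCLE_SIZE (32u * 1024u) /* bytes covered by one header */

/* Total headers (record header plus extended headers) for a given size. */
unsigned int record_headers(unsigned int size)
{
	unsigned int n = (size + HEADER_CYCLE_SIZE - 1) / HEADER_CYCLE_SIZE;

	return n ? n : 1; /* there is always at least the record header */
}

/* Extended headers that must be included in the checksum. */
unsigned int extended_headers(unsigned int size)
{
	return record_headers(size) - 1;
}
```

    Because the count derives from a size stored with the record rather than from the mount option, the same record checksums identically whether the filesystem is mounted with a 32k or a 64k logbsize.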
     
  • Log recovery occurs in two phases at mount time. In the first phase,
    EFIs and EFDs are processed and potentially cancelled out. EFIs without
    EFD objects are inserted into the AIL for processing and recovery in the
    second phase. xfs_mountfs() runs various other operations between the
    phases and is thus subject to failure. If failure occurs after the first
    phase but before the second, pending EFIs sit on the AIL, pin it and
    cause the mount to hang.

    Update the mount sequence to ensure that pending EFIs are cancelled in
    the event of failure. Add a recovery cancellation mechanism to iterate
    the AIL and cancel all EFI items when requested. Plumb cancellation
    support through the log mount finish helper and update xfs_mountfs() to
    invoke cancellation in the event of failure after recovery has started.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     

29 Jul, 2015

1 commit

  • The second and subsequent lines of multi-line logging messages
    are not prefixed with the same information as the first line.

    Separate messages with newlines into multiple calls to ensure
    consistent prefixing and allow easier grep use.

    Signed-off-by: Joe Perches
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Joe Perches
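
    The commit splits the messages by hand at each call site. The effect can be shown with a small user-space sketch that emits one prefixed line per newline-separated segment; log_lines() and its prefix format are hypothetical and only illustrate why per-line calls make every line greppable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write each newline-separated line of msg into out, repeating the same
 * prefix on every line. Returns the number of complete lines written.
 */
int log_lines(char *out, size_t outsz, const char *prefix, const char *msg)
{
	int lines = 0;
	size_t used = 0;

	assert(out && outsz > 0);
	out[0] = '\0';
	while (*msg) {
		size_t len = strcspn(msg, "\n");
		int n = snprintf(out + used, outsz - used, "%s: %.*s\n",
				 prefix, (int)len, msg);

		if (n < 0 || (size_t)n >= outsz - used)
			break; /* buffer full: stop counting lines */
		used += (size_t)n;
		lines++;
		msg += len;
		if (*msg == '\n')
			msg++;
	}
	return lines;
}
```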
     

23 Jun, 2015

1 commit


22 Jun, 2015

3 commits


04 Jun, 2015

1 commit

  • Instead of the confusing flags argument pass a boolean flag to indicate if
    we want to release or regrant a log reservation.

    Also ensure that xfs_log_done always drops the reference on the log ticket,
    to both simplify the code and make the logic in xfs_trans_roll easier
    to understand.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     

22 Jan, 2015

2 commits

  • Conflicts:
    fs/xfs/xfs_mount.c

    Dave Chinner
     
  • We now have several superblock logging functions that are identical
    except for the transaction reservation and whether it should be a
    synchronous transaction or not. Consolidate these all into a single
    function, a single reservation and a sync flag and call it
    xfs_sync_sb().

    Also, xfs_mod_sb() is not really a modification function - it's the
    operation of logging the superblock buffer. Hence change the name of
    it to reflect this.

    Note that we have to change the mp->m_update_flags that are passed
    around at mount time to a boolean simply to indicate a superblock
    update is needed.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Signed-off-by: Dave Chinner

    Dave Chinner
     

24 Dec, 2014

2 commits

  • xfs_warn() and friends add a newline by default, but some
    messages add another one.

    Particularly for the failing write message below, this can
    waste a lot of console real estate!

    Signed-off-by: Eric Sandeen
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Eric Sandeen
     
  • Log buffer I/O completion passes through the high priority
    m_log_workqueue rather than the default metadata buffer workqueue. The
    log buffer wq is initialized at I/O submission time. The log buffers are
    reused once initialized, however, so this is not necessary.

    Initialize the log buffer I/O completion workqueue pointers once when
    the log is allocated and log buffers initialized rather than on every
    log buffer I/O submission.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     

04 Dec, 2014

2 commits

  • Conflicts:
    fs/xfs/xfs_iops.c

    Dave Chinner
     
  • XFS traditionally sends all buffer I/O completion work to a single
    workqueue. This includes metadata buffer completion and log buffer
    completion. The log buffer completion requires a high priority queue to
    prevent stalls due to log forces getting stuck behind other queued work.

    Rather than continue to prioritize all buffer I/O completion due to the
    needs of log completion, split log buffer completion off to
    m_log_workqueue and move the high priority flag from m_buf_workqueue to
    m_log_workqueue.

    Add a b_ioend_wq wq pointer to xfs_buf to allow completion workqueue
    customization on a per-buffer basis. Initialize b_ioend_wq to
    m_buf_workqueue by default in the generic buffer I/O submission path.
    Finally, override the default wq with the high priority m_log_workqueue
    in the log buffer I/O submission path.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     

28 Nov, 2014

4 commits

  • Dave Chinner
     
  • More on-disk format consolidation.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     
  • More on-disk format consolidation. A few declarations that weren't on-disk
    format related move into better suited spots.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     
  • The expectation since the introduction of the lazy superblock counters is
    that the counters are synced and superblock logged appropriately as part
    of the filesystem freeze sequence. This does not occur, however, due to
    the logic in xfs_fs_writable() that prevents progress when the fs is in
    any state other than SB_UNFROZEN.

    While this is a bug, it has not been exposed to date because the last
    thing XFS does during freeze is dirty the log. The log recovery process
    recalculates the counters from AGI/AGF metadata to ensure everything is
    correct. Therefore should a crash occur while an fs is frozen, the
    subsequent log recovery puts everything back in order. See the following
    commit for reference:

    92821e2b [XFS] Lazy Superblock Counters

    We might not always want to rely on dirtying the log on a frozen fs.
    Modify xfs_log_sbcount() to proceed when the filesystem is freezing but
    not once the freeze process has completed. Modify xfs_fs_writable() to
    accept the minimum freeze level for which modifications should be
    blocked to support various codepaths.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
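
    The level-based check can be modeled as below. This is a sketch, not the kernel function: the enum mirrors the ordering of the VFS freeze states, while the structure and the name fs_writable() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze states in increasing order of "frozenness". */
enum freeze_level {
	UNFROZEN = 0,
	FREEZE_WRITE,
	FREEZE_PAGEFAULT,
	FREEZE_FS,
	FREEZE_COMPLETE,
};

struct fs_state {
	enum freeze_level frozen;
	bool shutdown;
	bool readonly;
};

/* Allow modifications until the freeze has reached the caller's level. */
bool fs_writable(const struct fs_state *fs, enum freeze_level level)
{
	assert(level > UNFROZEN);
	if (fs->frozen >= level || fs->shutdown || fs->readonly)
		return false;
	return true;
}
```

    A caller that must still log the superblock counters while the freeze is in progress would pass FREEZE_COMPLETE; a caller that must stop at the first sign of a freeze would pass FREEZE_WRITE.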
     

02 Oct, 2014

3 commits

  • There is a lot of cookie-cutter code that looks like:

    if (shutdown)
            handle buffer error
    xfs_buf_iorequest(bp)
    error = xfs_buf_iowait(bp)
    if (error)
            handle buffer error

    spread through XFS. There's significant complexity now in
    xfs_buf_iorequest() to specifically handle this sort of synchronous
    IO pattern, but there's all sorts of nasty surprises in different
    error handling code dependent on who owns the buffer references and
    the locks.

    Pull this pattern into a single helper, where we can hide all the
    synchronous IO warts and hence make the error handling for all the
    callers much saner. This removes the need for a special extra
    reference to protect IO completion processing, as we can now hold a
    single reference across dispatch and waiting, simplifying the sync
    IO semantics and error handling.

    In doing this, also rename xfs_buf_iorequest to xfs_buf_submit and
    make it explicitly handle only asynchronous IO. This forces all users
    to be switched specifically to one interface or the other and
    removes any ambiguity between how the interfaces are to be used. It
    also means that xfs_buf_iowait() goes away.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Dave Chinner
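
    The shape of such a helper can be sketched in user space. Everything here is hypothetical: buf_submit_wait(), the structure, and the callback standing in for real dispatch plus completion. It only illustrates how the shutdown check, a single reference held across dispatch and wait, and the error return collapse into one call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct buf {
	int hold;  /* reference count */
	int error; /* result of the last I/O */
};

typedef void (*io_fn)(struct buf *bp);

/* Submit I/O and wait for it, returning the buffer error to the caller. */
int buf_submit_wait(struct buf *bp, bool fs_shutdown, io_fn do_io)
{
	assert(bp && do_io);
	if (fs_shutdown) {
		bp->error = -EIO;
		return bp->error;
	}
	bp->error = 0;
	bp->hold++;  /* one reference covers dispatch and waiting */
	do_io(bp);   /* stands in for dispatch followed by completion */
	bp->hold--;
	return bp->error;
}

/* Example I/O routines used to exercise the helper. */
void io_ok(struct buf *bp)   { bp->error = 0; }
void io_fail(struct buf *bp) { bp->error = -EIO; }
```

    Callers then reduce to a single call followed by one error check, instead of repeating the pattern quoted in the commit message.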
     
  • We do some work in xfs_buf_ioend, and some work in
    xfs_buf_iodone_work, but much of that functionality is the same.
    This work can all be done in a single function, leaving
    xfs_buf_iodone just a wrapper to determine if we should execute it
    by workqueue or directly. Hence rename xfs_buf_iodone_work to
    xfs_buf_ioend(), and add a new xfs_buf_ioend_async() for places that
    need async processing.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Dave Chinner
     
  • When we have marked the filesystem for shutdown, we want to prevent
    any further buffer IO from being submitted. However, we currently
    force the log after marking the filesystem as shut down, hence
    allowing IO to the log *after* we have marked both the filesystem
    and the log as in an error state.

    Clean this up by forcing the log before we mark the filesystem with
    an error. This replaces the pure CIL flush that we currently have
    which works around this same issue (i.e. the CIL can't be flushed
    once the shutdown flags are set) and hence enables us to clean up
    the logic substantially.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Dave Chinner