04 Feb, 2018

1 commit

  • [ Upstream commit 509955823cc9cc225c05673b1b83d70ca70c5c60 ]

    As part of testing log recovery with dm_log_writes, Amir Goldstein
    discovered an error in the deferred ops recovery that led to corruption
    of the filesystem metadata if a reflink+rmap filesystem happened to shut
    down midway through a CoW remap:

    "This is what happens [after failed log recovery]:

    "Phase 1 - find and verify superblock...
    "Phase 2 - using internal log
    " - zero log...
    " - scan filesystem freespace and inode maps...
    " - found root inode chunk
    "Phase 3 - for each AG...
    " - scan (but don't clear) agi unlinked lists...
    " - process known inodes and perform inode discovery...
    " - agno = 0
    "data fork in regular inode 134 claims CoW block 376
    "correcting nextents for inode 134
    "bad data fork in inode 134
    "would have cleared inode 134"

    Hou Tao dissected the log contents of exactly such a crash:

    "According to the implementation of xfs_defer_finish(), these ops should
    be completed in the following sequence:

    "Have been done:
    "(1) CUI: Oper (160)
    "(2) BUI: Oper (161)
    "(3) CUD: Oper (194), for CUI Oper (160)
    "(4) RUI A: Oper (197), free rmap [0x155, 2, -9]

    "Should be done:
    "(5) BUD: for BUI Oper (161)
    "(6) RUI B: add rmap [0x155, 2, 137]
    "(7) RUD: for RUI A
    "(8) RUD: for RUI B

    "Actually be done by xlog_recover_process_intents()
    "(5) BUD: for BUI Oper (161)
    "(6) RUI B: add rmap [0x155, 2, 137]
    "(7) RUD: for RUI B
    "(8) RUD: for RUI A

    "So the rmap entry [0x155, 2, -9] for COW should be freed firstly,
    then a new rmap entry [0x155, 2, 137] will be added. However, as we can see
    from the log record in post_mount.log (generated after umount) and the trace
    print, the new rmap entry [0x155, 2, 137] are added firstly, then the rmap
    entry [0x155, 2, -9] are freed."

    When reconstructing the internal log state from the log items found on
    disk, it's required that deferred ops replay in exactly the same order
    that they would have had the filesystem not gone down. However,
    replaying unfinished deferred ops can create /more/ deferred ops. These
    new deferred ops are finished in the wrong order. This causes fs
    corruption and replay crashes, so let's create a single defer_ops to
    handle the subsequent ops created during replay, then use one single
    transaction at the end of log recovery to ensure that everything is
    replayed in the same order as they're supposed to be.

    Reported-by: Amir Goldstein
    Analyzed-by: Hou Tao
    Reviewed-by: Christoph Hellwig
    Tested-by: Amir Goldstein
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Darrick J. Wong
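
The ordering problem above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel implementation: ops are plain integers, the `spawn` table and the `replay` function are hypothetical names, and a new deferred op never spawns further work. It compares finishing a newly created op immediately (the old behavior) with parking it on one shared queue that is drained after all recovered intents.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 16  /* hypothetical upper bound for this toy model */

/*
 * Replay n recovered intents in log order and record the completion order.
 * spawn[op] >= 0 means that finishing `op` creates the new deferred op
 * spawn[op]. With single_queue == 0 the new op is finished immediately;
 * with single_queue != 0 it waits until all recovered intents are done.
 * Returns the number of entries written to order[].
 */
size_t replay(const int *recovered, size_t n, const int *spawn,
              int single_queue, int *order)
{
    int deferred[MAX_OPS];
    size_t nd = 0, no = 0;

    assert(n <= MAX_OPS / 2);  /* each intent spawns at most one new op */
    for (size_t i = 0; i < n; i++) {
        int op = recovered[i];

        order[no++] = op;
        if (spawn[op] < 0)
            continue;
        if (single_queue)
            deferred[nd++] = spawn[op];  /* park until the end */
        else
            order[no++] = spawn[op];     /* jumps ahead of later intents */
    }
    for (size_t i = 0; i < nd; i++)
        order[no++] = deferred[i];
    return no;
}
```

With op 0 standing for the BUI, op 1 for RUI A and op 2 for RUI B (created while replaying the BUI), the immediate variant yields 0, 2, 1, which mirrors the "add before free" sequence in the report, while the single-queue variant yields 0, 1, 2.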
     

20 Dec, 2017

1 commit

  • [ Upstream commit 9f2a4505800607e537e9dd9dea4f55c4b0c30c7a ]

    It is possible for mkfs to format very small filesystems with too
    small of an internal log with respect to the various minimum size
    and block count requirements. If this occurs when the log happens to
    be smaller than the scan window used for cycle verification and the
    scan wraps the end of the log, the start_blk calculation in
    xlog_find_head() underflows and leads to an attempt to scan an
    invalid range of log blocks. This results in log recovery failure
    and a failed mount.

    Since there may be filesystems out in the wild with this kind of
    geometry, we cannot simply refuse to mount. Instead, cap the scan
    window for cycle verification to the size of the physical log. This
    ensures that the cycle verification proceeds as expected when the
    scan wraps the end of the log.

    Reported-by: Zorro Lang
    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Brian Foster
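
The underflow and the cap can be sketched in a few lines of standalone C. This is an illustration under assumptions, not the code in xlog_find_head(): the helper name `scan_start_blk` and its parameters are hypothetical, and the arithmetic only models a backwards scan of `window` blocks ending at the head of a circular log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First block of a backwards scan of `window` blocks that ends at head_blk
 * in a circular log of log_bbnum blocks. With cap == 0 the window is used
 * as given; with cap != 0 it is limited to the physical log size first.
 */
int64_t scan_start_blk(int64_t log_bbnum, int64_t head_blk,
                       int64_t window, int cap)
{
    assert(log_bbnum > 0 && window > 0);
    assert(head_blk >= 0 && head_blk < log_bbnum);

    if (cap && window > log_bbnum)
        window = log_bbnum;

    if (head_blk >= window)
        return head_blk - window;           /* scan does not wrap */
    return log_bbnum - (window - head_blk); /* scan wraps the log end */
}
```

For a 100-block log with the head at block 10 and a 512-block window, the uncapped call returns a negative block number, while the capped call stays inside the log.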
     

02 Sep, 2017

1 commit


23 Aug, 2017

5 commits

  • Torn write detection and tail overwrite detection can shift the log
    head and tail respectively in the event of CRC mismatch or
    corruption errors. Add a high-level log recovery tracepoint to dump
    the final log head/tail and make those values easily attainable in
    debug/diagnostic situations.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
     
  • Torn write and tail overwrite detection both trigger only on
    -EFSBADCRC errors. While this is the most likely failure scenario
    for each condition, -EFSCORRUPTED is still possible in certain cases
    depending on what ends up on disk when a torn write or partial tail
    overwrite occurs. For example, an invalid log record h_len can lead
    to an -EFSCORRUPTED error when running the log recovery CRC pass.

    Therefore, update log head and tail verification to trigger the
    associated head/tail fixups in the event of -EFSCORRUPTED errors
    along with -EFSBADCRC. Also, -EFSCORRUPTED can currently be returned
    from xlog_do_recovery_pass() before rhead_blk is initialized if the
    first record encountered happens to be corrupted. This leads to an
    incorrect 'first_bad' return value. Initialize rhead_blk earlier in
    the function to address that problem as well.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
     
  • If we consider the case where the tail (T) of the log is pinned long
    enough for the head (H) to push and block behind the tail, we can
    end up blocked in the following state without enough free space (f)
    in the log to satisfy a transaction reservation:

    0       phys. log       N
    [-------HffT---H'--T'---]

    The last good record in the log (before H) refers to T. The tail
    eventually pushes forward (T') leaving more free space in the log
    for writes to H. At this point, suppose space frees up in the log
    for the maximum of 8 in-core log buffers to start flushing out to
    the log. If this pushes the head from H to H', these next writes
    overwrite the previous tail T. This is safe because the items logged
    from T to T' have been written back and removed from the AIL.

    If the next log writes (H -> H') happen to fail and result in
    partial records in the log, the filesystem shuts down having
    overwritten T with invalid data. Log recovery correctly locates H on
    the subsequent mount, but H still refers to the now corrupted tail
    T. This results in log corruption errors and recovery failure.

    Since the tail overwrite results from otherwise correct runtime
    behavior, it is up to log recovery to try and deal with this
    situation. Update log recovery tail verification to run a CRC pass
    from the first record past the tail to the head. This facilitates
    error detection at T and moves the recovery tail to the first good
    record past H' (similar to truncating the head on torn write
    detection). If corruption is detected beyond the range possibly
    affected by the max number of iclogs, the log is legitimately
    corrupted and log recovery failure is expected.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
     
  • Log tail verification currently only occurs when torn writes are
    detected at the head of the log. This was introduced because a
    change in the head block due to torn writes can lead to a change in
    the tail block (each log record header references the current tail)
    and the tail block should be verified before log recovery proceeds.

    Tail corruption is possible outside of torn write scenarios,
    however. For example, partial log writes can be detected and cleared
    during the initial head/tail block discovery process. If the partial
    write coincides with a tail overwrite, the log tail is corrupted and
    recovery fails.

    To facilitate correct handling of log tail overwrites, update log
    recovery to always perform tail verification. This is necessary to
    detect potential tail overwrite conditions when torn writes may not
    have occurred. This changes normal (i.e., no torn writes) recovery
    behavior slightly to detect and return CRC related errors near the
    tail before actual recovery starts.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
     
  • The high-level log recovery algorithm consists of two loops that
    walk the physical log and process log records from the tail to the
    head. The first loop handles the case where the tail is beyond the
    head and processes records up to the end of the physical log. The
    subsequent loop processes records from the beginning of the physical
    log to the head.

    Because log records can wrap around the end of the physical log, the
    first loop mentioned above must handle this case appropriately.
    Records are processed from in-core buffers, which means that this
    algorithm must split the reads of such records into two partial
    I/Os: 1.) from the beginning of the record to the end of the log and
    2.) from the beginning of the log to the end of the record. This is
    further complicated by the fact that the log record header and log
    record data are read into independent buffers.

    The current handling of each buffer correctly splits the reads when
    either the header or data starts before the end of the log and wraps
    around the end. The data read does not correctly handle the case
    where the prior header read wrapped or ends on the physical log end
    boundary. blk_no is incremented to or beyond the log end after the
    header read to point to the record data, but the split data read
    logic triggers, attempts to read from an invalid log block and
    ultimately causes log recovery to fail. This can be reproduced
    fairly reliably via xfstests tests generic/047 and generic/388 with
    large iclog sizes (256k) and small (10M) logs.

    If the record header read has pushed beyond the end of the physical
    log, the subsequent data read is actually contiguous. Update the
    data read logic to detect the case where blk_no has wrapped, mod it
    against the log size to read from the correct address and issue one
    contiguous read for the log data buffer. The log record is processed
    as normal from the buffer(s), the loop exits after the current
    iteration and the subsequent loop picks up with the first new record
    after the start of the log.

    Signed-off-by: Brian Foster
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
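
The data-read decision described in the last commit above (split reads at the log end) can be sketched as follows. This is a simplified model, not the kernel code: `plan_data_read` and `struct log_read` are hypothetical names, block numbers are plain integers, and it assumes a wrapped blk_no plus the record length still fits inside the log.

```c
#include <assert.h>

struct log_read {
    int start;  /* first log block to read */
    int count;  /* number of blocks to read */
};

/*
 * Plan the read of bblks data blocks starting at blk_no in a log of
 * log_bbnum blocks. blk_no may already sit at or beyond the log end if the
 * preceding header read wrapped. Returns the number of reads (1 or 2).
 */
int plan_data_read(int log_bbnum, int blk_no, int bblks,
                   struct log_read out[2])
{
    assert(log_bbnum > 0 && blk_no >= 0);
    assert(bblks > 0 && bblks <= log_bbnum);

    if (blk_no >= log_bbnum || blk_no + bblks <= log_bbnum) {
        /* Contiguous: either the header already wrapped or nothing wraps. */
        out[0].start = blk_no % log_bbnum;
        out[0].count = bblks;
        return 1;
    }

    /* The data itself wraps: read up to the log end, then from block 0. */
    out[0].start = blk_no;
    out[0].count = log_bbnum - blk_no;
    out[1].start = 0;
    out[1].count = bblks - out[0].count;
    return 2;
}
```

In a 100-block log, a data read at blk_no 102 becomes one contiguous read at block 2, whereas a read of 8 blocks at blk_no 96 is still split into two reads of 4 blocks each.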
     

11 Jul, 2017

1 commit

  • Pull XFS updates from Darrick Wong:
    "Here are some changes for you for 4.13. For the most part it's fixes
    for bugs and deadlock problems, and preparation for online fsck in
    some future merge window.

    - Avoid quotacheck deadlocks

    - Fix transaction overflows when bunmapping fragmented files

    - Refactor directory readahead

    - Allow admin to configure if ASSERT is fatal

    - Improve transaction usage detail logging during overflows

    - Minor cleanups

    - Don't leak log items when the log shuts down

    - Remove double-underscore typedefs

    - Various preparation for online scrubbing

    - Introduce new error injection configuration sysfs knobs

    - Refactor dq_get_next to use extent map directly

    - Fix problems with iterating the page cache for unwritten data

    - Implement SEEK_{HOLE,DATA} via iomap

    - Refactor XFS to use iomap SEEK_HOLE and SEEK_DATA

    - Don't use MAXPATHLEN to check on-disk symlink target lengths"

    * tag 'xfs-4.13-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (48 commits)
    xfs: don't crash on unexpected holes in dir/attr btrees
    xfs: rename MAXPATHLEN to XFS_SYMLINK_MAXLEN
    xfs: fix contiguous dquot chunk iteration livelock
    xfs: Switch to iomap for SEEK_HOLE / SEEK_DATA
    vfs: Add iomap_seek_hole and iomap_seek_data helpers
    vfs: Add page_cache_seek_hole_data helper
    xfs: remove a whitespace-only line from xfs_fs_get_nextdqblk
    xfs: rewrite xfs_dq_get_next_id using xfs_iext_lookup_extent
    xfs: Check for m_errortag initialization in xfs_errortag_test
    xfs: grab dquots without taking the ilock
    xfs: fix semicolon.cocci warnings
    xfs: Don't clear SGID when inheriting ACLs
    xfs: free cowblocks and retry on buffered write ENOSPC
    xfs: replace log_badcrc_factor knob with error injection tag
    xfs: convert drop_writes to use the errortag mechanism
    xfs: remove unneeded parameter from XFS_TEST_ERROR
    xfs: expose errortag knobs via sysfs
    xfs: make errortag a per-mountpoint structure
    xfs: free uncommitted transactions during log recovery
    xfs: don't allow bmap on rt files
    ...

    Linus Torvalds
     

25 Jun, 2017

1 commit

  • Log recovery allocates in-core transaction and member item data
    structures on-demand as it processes the on-disk log. Transactions
    are allocated on first encounter on-disk and stored in a hash table
    structure where they are easily accessible for subsequent lookups.
    Transaction items are also allocated on demand and are attached to
    the associated transactions.

    When a commit record is encountered in the log, the transaction is
    committed to the fs and the in-core structures are freed. If a
    filesystem crashes or shuts down before all in-core log buffers are
    flushed to the log, however, not all transactions may have commit
    records in the log. As expected, the modifications in such an
    incomplete transaction are not replayed to the fs. The in-core data
    structures for the partial transaction are never freed, however,
    resulting in a memory leak.

    Update xlog_do_recovery_pass() to first correctly initialize the
    hash table array so empty lists can be distinguished from populated
    lists on function exit. Update xlog_recover_free_trans() to always
    remove the transaction from the list prior to freeing the associated
    memory. Finally, walk the hash table of transaction lists as the
    last step before it goes out of scope and free any transactions that
    may remain on the lists. This prevents a memory leak of partial
    transactions in the log.

    Signed-off-by: Brian Foster
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Brian Foster
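
The three steps of the fix (initialize the table, unlink before freeing, sweep leftovers on exit) can be sketched in userspace C. The names below (`rhash`, `rtrans`, `rhash_*`) are hypothetical and the kernel uses its own list primitives; this only illustrates the pattern.

```c
#include <stdlib.h>

#define NBUCKETS 16  /* hypothetical table size */

struct rtrans {
    unsigned tid;         /* transaction id found in the log */
    struct rtrans *next;
};

struct rhash {
    struct rtrans *bucket[NBUCKETS];
};

/* Start with every list empty so a later sweep can tell what is populated. */
void rhash_init(struct rhash *h)
{
    for (int i = 0; i < NBUCKETS; i++)
        h->bucket[i] = NULL;
}

/* Look up a transaction, allocating it on first encounter. */
struct rtrans *rhash_get(struct rhash *h, unsigned tid)
{
    struct rtrans **head = &h->bucket[tid % NBUCKETS];
    struct rtrans *t;

    for (t = *head; t != NULL; t = t->next)
        if (t->tid == tid)
            return t;
    t = malloc(sizeof(*t));
    if (t == NULL)
        return NULL;
    t->tid = tid;
    t->next = *head;
    *head = t;
    return t;
}

/* Commit record seen: unlink the transaction first, then free it. */
int rhash_commit(struct rhash *h, unsigned tid)
{
    struct rtrans **pp = &h->bucket[tid % NBUCKETS];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->tid == tid) {
            struct rtrans *t = *pp;

            *pp = t->next;
            free(t);
            return 1;
        }
    }
    return 0;
}

/* Last step of the pass: free transactions that never saw a commit record. */
int rhash_free_all(struct rhash *h)
{
    int freed = 0;

    for (int i = 0; i < NBUCKETS; i++) {
        while (h->bucket[i] != NULL) {
            struct rtrans *t = h->bucket[i];

            h->bucket[i] = t->next;
            free(t);
            freed++;
        }
    }
    return freed;
}
```

Calling rhash_free_all() once before the table goes out of scope reports how many partial transactions were still attached; without that sweep those allocations would be lost.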
     

20 Jun, 2017

1 commit

  • This is a purely mechanical patch that removes the private
    __{u,}int{8,16,32,64}_t typedefs in favor of using the system
    {u,}int{8,16,32,64}_t typedefs. This is the sed script used to perform
    the transformation and fix the resulting whitespace and indentation
    errors:

    s/typedef\t__uint8_t/typedef __uint8_t\t/g
    s/typedef\t__uint/typedef __uint/g
    s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g
    s/__uint8_t\t/__uint8_t\t\t/g
    s/__uint/uint/g
    s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g
    s/__int/int/g
    /^typedef.*int[0-9]*_t;$/d

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig

    Darrick J. Wong
     

05 Jun, 2017

1 commit

  • Use the common helper uuid_is_null() and remove the xfs specific
    helper uuid_is_nil().

    The common helper does not check for the NULL pointer value as
    xfs helper did, but xfs code never calls the helper with a pointer
    that can be NULL.

    Conform comments and warning strings to use the term 'null uuid'
    instead of 'nil uuid', because this is the terminology used by
    lib/uuid.c and its users. It is also the terminology used in
    userspace by libuuid and xfsprogs.

    Signed-off-by: Amir Goldstein
    [hch: remove now unused uuid.[ch]]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Brian Foster
    Reviewed-by: Andy Shevchenko

    Amir Goldstein
     

09 May, 2017

1 commit

  • Fix typos and add the following to the scripts/spelling.txt:

    intialisation||initialisation
    intialised||initialised
    intialise||initialise

    This commit does not intend to change the British spelling itself.

    Link: http://lkml.kernel.org/r/1481573103-11329-18-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     

07 Dec, 2016

1 commit


05 Dec, 2016

3 commits

  • Nick Piggin reported that the CRC overhead in an fsync heavy
    workload was higher than expected on a Power8 machine. Part of this
    was to do with the fact that the power8 CRC implementation is not
    efficient for CRC lengths of less than 512 bytes, and so the way we
    split the CRCs over the CRC field means a lot of the CRCs are
    reduced to being less than optimal size.

    To optimise this, change the CRC update mechanism to zero the CRC
    field first, and then compute the CRC in one pass over the buffer
    and write the result back into the buffer. We can do this safely
    because anything writing a CRC has exclusive access to the buffer
    the CRC is being calculated over.

    We leave the CRC verify code the same - it still splits the CRC
    calculation - because we do not want read-only operations modifying
    the underlying buffer. This is because read-only operations may not
    have an exclusive access to the buffer guaranteed, and so temporary
    modifications could leak out to other processes accessing the
    buffer concurrently.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Dave Chinner
     
  • We've missed properly setting the buffer type for
    an AGI transaction in 3 spots now, so just move it
    into xfs_read_agi() and set it if we are in a transaction
    to avoid the problem in the future.

    This is similar to how it is done in, e.g., the dir3
    and attr3 read functions.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Brian Foster
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Eric Sandeen
     
  • xlog_recover_clear_agi_bucket didn't set the
    type to XFS_BLFT_AGI_BUF, so we got a warning during log
    replay (or an ASSERT on a debug build).

    XFS (md0): Unknown buffer type 0!
    XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1

    Fix this, as was done in f19b872b for 2 other locations
    with the same problem.

    cc: # 3.10 to current
    Signed-off-by: Eric Sandeen
    Reviewed-by: Brian Foster
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Dave Chinner

    Eric Sandeen
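
The CRC change in the first commit of this group (zero the CRC field, then checksum the buffer in one pass) can be contrasted with the split calculation kept for verification. The sketch below is a userspace illustration under assumptions: the helper names are hypothetical, the CRC32c routine is a plain bitwise one rather than an optimized implementation, and the checksum is stored in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32c (Castagnoli polynomial, reflected form). */
uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Read-only path: three pieces, substituting zeros for the CRC field. */
uint32_t cksum_split(const uint8_t *buf, size_t len, size_t crc_off)
{
    const uint32_t zero = 0;
    uint32_t crc = crc32c_update(~0u, buf, crc_off);

    crc = crc32c_update(crc, &zero, sizeof(zero));
    crc = crc32c_update(crc, buf + crc_off + sizeof(zero),
                        len - crc_off - sizeof(zero));
    return ~crc;
}

/* Writer path: zero the field in place, one pass, store the result. */
void cksum_update(uint8_t *buf, size_t len, size_t crc_off)
{
    uint32_t crc;

    memset(buf + crc_off, 0, sizeof(crc));
    crc = ~crc32c_update(~0u, buf, len);
    memcpy(buf + crc_off, &crc, sizeof(crc));
}

/* Verification never modifies the buffer. */
int cksum_verify(const uint8_t *buf, size_t len, size_t crc_off)
{
    uint32_t stored;

    memcpy(&stored, buf + crc_off, sizeof(stored));
    return stored == cksum_split(buf, len, crc_off);
}
```

Both paths produce the same value because the split variant feeds zeros where the writer has zeroed the field; only the writer, which holds the buffer exclusively, touches the buffer contents.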
     

08 Nov, 2016

1 commit


05 Oct, 2016

2 commits

  • Log recovery will iget an inode to replay BUI items and iput the inode
    when it's done. Unfortunately, if the inode was unlinked, the iput
    will see that i_nlink == 0 and decide to truncate & free the inode,
    which prevents us from replaying subsequent BUIs. We can't skip the
    BUIs because we have to replay all the redo items to ensure that
    atomic operations complete.

    Since unlinked inode recovery will reap the inode anyway, we can
    safely introduce a new inode flag to indicate that an inode is in this
    'unlinked recovery' state and should not be auto-reaped in the
    drop_inode path.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig

    Darrick J. Wong
     
  • Provide a mechanism for higher levels to create BUI/BUD items, submit
    them to the log, and a stub function to deal with recovered BUI items.
    These parts will be connected to the rmapbt in a later patch.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig

    Darrick J. Wong
     

04 Oct, 2016

2 commits


26 Sep, 2016

5 commits

  • Log recovery has particular rules around buffer submission along with
    tricky corner cases where independent transactions can share an LSN. As
    such, it can be difficult to follow when/why buffers are submitted
    during recovery.

    Add a couple tracepoints to post the current LSN of a record when a new
    record is being processed and when a buffer is being skipped due to LSN
    ordering. Also, update the recover item class to include the LSN of the
    current transaction for the item being processed.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     
  • Log recovery is currently broken for v5 superblocks in that it never
    updates the metadata LSN of buffers written out during recovery. The
    metadata LSN is recorded in various bits of metadata to provide recovery
    ordering criteria that prevents transient corruption states reported by
    buffer write verifiers. Without such ordering logic, buffer updates can
    be replayed out of order and lead to false positive transient corruption
    states. This is generally not a corruption vector on its own, but
    corruption detection shuts down the filesystem and ultimately prevents a
    mount if it occurs during log recovery. This requires an xfs_repair run
    that clears the log and potentially loses filesystem updates.

    This problem is avoided in most cases as metadata writes during normal
    filesystem operation update the metadata LSN appropriately. The problem
    with log recovery not updating metadata LSNs manifests if the system
    happens to crash shortly after log recovery itself. In this scenario, it
    is possible for log recovery to complete all metadata I/O such that the
    filesystem is consistent. If a crash occurs after that point but before
    the log tail is pushed forward by subsequent operations, however, the
    next mount performs the same log recovery over again. If a buffer is
    updated multiple times in the dirty range of the log, an earlier update
    in the log might not be valid based on the current state of the
    associated buffer after all of the updates in the log had been replayed
    (before the previous crash). If a verifier happens to detect such a
    problem, the filesystem claims corruption and immediately shuts down.

    This commonly manifests in practice as directory block verifier failures
    such as the following, likely due to directory verifiers being
    particularly detailed in their checks as compared to most others:

    ...
    Mounting V5 Filesystem
    XFS (dm-0): Starting recovery (logdev: internal)
    XFS (dm-0): Internal error XFS_WANT_CORRUPTED_RETURN at line ... of \
    file fs/xfs/libxfs/xfs_dir2_data.c. Caller xfs_dir3_data_verify ...
    ...

    Update log recovery to update the metadata LSN of recovered buffers.
    Since metadata LSNs are already updated by write verifier functions via
    attached log items, attach a dummy log item to the buffer during
    validation and explicitly set the LSN of the current transaction. This
    ensures that the metadata LSN of a buffer is updated based on whether
    the recovery I/O actually completes, and if so, that subsequent recovery
    attempts identify that the buffer is already up to date with respect to
    the current transaction.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     
  • The log recovery buffer validation function is invoked in cases where a
    buffer update may be skipped due to LSN ordering. If the validation
    function happens to come across directory conversion situations (e.g., a
    dir3 block to data conversion), it may warn about seeing a buffer log
    format of one type and a buffer with a magic number of another.

    This warning is not valid as the buffer update is ultimately skipped.
    This is indicated by a current_lsn of NULLCOMMITLSN provided by the
    caller. As such, update xlog_recover_validate_buf_type() to only warn in
    such cases when a buffer update is expected.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     
  • The current LSN must be available to the buffer validation function to
    provide the ability to update the metadata LSN of the buffer. Pass the
    current_lsn value down to xlog_recover_validate_buf_type() in
    preparation.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
     
  • The fix to log recovery to update the metadata LSN in recovered buffers
    introduces the requirement that a buffer is submitted only once per
    current LSN. Log recovery currently submits buffers on transaction
    boundaries. This is not sufficient as the abstraction between log
    records and transactions allows for various scenarios where multiple
    transactions can share the same current LSN. If independent transactions
    share an LSN and both modify the same buffer, log recovery can
    incorrectly skip updates and leave the filesystem in an inconsistent
    state.

    In preparation for proper metadata LSN updates during log recovery,
    update log recovery to submit buffers for write on LSN change boundaries
    rather than transaction boundaries. Explicitly track the current LSN in
    a new struct xlog field to handle the various corner cases of when the
    current LSN may or may not change.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Brian Foster
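
The interaction between metadata LSN stamping and buffer submission described in this group of commits can be modeled with a single buffer. This is a toy sketch, not kernel code: `struct update` and `recover` are hypothetical, each update stands for a separate transaction touching the same buffer, and every submitted write is assumed to complete.

```c
#include <stdbool.h>
#include <stdint.h>

/* One logged change to a single metadata buffer. */
struct update {
    int64_t lsn;    /* current LSN when the transaction is recovered */
    int     delta;  /* stand-in for the logged modification */
};

/*
 * Replay n updates against one buffer. *disk_lsn is the LSN stamped in the
 * buffer when it is written back; an update is skipped if the buffer is
 * already at or past its LSN. With per_trans set, the buffer is submitted
 * after every transaction; otherwise only when the LSN changes.
 * Returns the number of updates actually applied.
 */
int recover(const struct update *u, int n, bool per_trans,
            int64_t *disk_lsn, int *value)
{
    int64_t pending_lsn = 0;
    bool dirty = false;
    int applied = 0;

    for (int i = 0; i < n; i++) {
        /* LSN boundary: write out what was queued under the old LSN. */
        if (dirty && u[i].lsn != pending_lsn) {
            *disk_lsn = pending_lsn;
            dirty = false;
        }
        if (*disk_lsn >= u[i].lsn)
            continue;               /* buffer already up to date */

        *value += u[i].delta;
        applied++;
        pending_lsn = u[i].lsn;
        dirty = true;

        if (per_trans) {            /* submit on transaction boundary */
            *disk_lsn = pending_lsn;
            dirty = false;
        }
    }
    if (dirty)
        *disk_lsn = pending_lsn;    /* final submission */
    return applied;
}
```

With updates at LSNs 10, 10 and 20, per-transaction submission skips the second update because the buffer was already stamped with LSN 10, while LSN-boundary submission applies all three. Running the boundary variant a second time applies nothing, which is the repeat-recovery case the LSN stamp is meant to handle.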
     

03 Aug, 2016

6 commits

  • Nothing ever uses the extent array in the rmap update done redo
    item, so remove it before it is fixed in the on-disk log format.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     
  • Originally-From: Dave Chinner

    So such blocks can be correctly identified and have their operations
    structures attached to validate recovery has not resulted in an
    incorrect block.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     
  • Provide a mechanism for higher levels to create RUI/RUD items, submit
    them to the log, and a stub function to deal with recovered RUI items.
    These parts will be connected to the rmapbt in a later patch.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     
  • Originally-From: Dave Chinner

    The rmap btree is allocated from the AGFL, which means we have to
    ensure ENOSPC is reported to userspace before we run out of free
    space in each AG. The last allocation in an AG can cause a full
    height rmap btree split, and that means we have to reserve at least
    this many blocks *in each AG* to be placed on the AGFL at ENOSPC.
    Update the various space calculation functions to handle this.

    Also, because the macros are now executing conditional code and are
    called quite frequently, convert them to functions that initialise
    variables in the struct xfs_mount, use the new variables everywhere
    and document the calculations better.

    [darrick.wong@oracle.com: don't reserve blocks if !rmap]
    [dchinner@redhat.com: update m_ag_max_usable after growfs]

    Signed-off-by: Dave Chinner
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     
  • Refactor the EFI intent item recovery (and cancellation) functions
    into a general function that scans the AIL and an intent item type
    specific handler. Move the function that recovers a single EFI item
    into the extent free item code. We'll want the generalized function
    when we start wiring up more redo item types.

    Furthermore, ensure that log recovery only replays the redo items
    that were in the AIL prior to recovery by checking the item LSN
    against the largest LSN seen during log scanning. As written this
    should never happen, but we can be defensive anyway.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     
  • Restructure everything that used xfs_bmap_free to use xfs_defer_ops
    instead. For now we'll just remove the old symbols and play some
    cpp magic to make it work; in the next patch we'll actually rename
    everything.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster
    Signed-off-by: Dave Chinner

    Darrick J. Wong
     

20 May, 2016

1 commit


06 Apr, 2016

2 commits

  • Use krealloc to implement our realloc function. This helps to avoid
    new allocations if we are still in the slab bucket. At least for the
    bmap btree root that's actually the common case.

    This also allows removing the now unused oldsize argument.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Brian Foster
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     
  • Merge xfs_trans_reserve and xfs_trans_alloc into a single function call
    that returns a transaction with all the required log and block reservations,
    and which allows passing transaction flags directly to avoid the cumbersome
    _xfs_trans_alloc interface.

    While we're at it we also get rid of the transaction type argument that has
    been superfluous since we stopped supporting the non-CIL logging mode. The
    guts of it will be removed in another patch.

    [dchinner: fixed transaction leak in error path in xfs_setattr_nonsize]

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner

    Christoph Hellwig
     

09 Mar, 2016

1 commit


07 Mar, 2016

3 commits