12 Oct, 2011

4 commits

  • There is no reason to keep a reference to the inode even if we unlock
    it during transaction commit because we never drop a reference between
    the ijoin and commit. Also use this fact to merge xfs_trans_ijoin_ref
    back into xfs_trans_ijoin - the third argument decides if an unlock
    is needed now.

    I'm actually starting to wonder if allowing inodes to be unlocked
    at transaction commit really is worth the effort. The only real
    benefit is that they can be unlocked earlier when committing a
    synchronous transaction, but that could be solved by doing the
    log force manually after the unlock, too.
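
    A rough sketch of the two resulting call patterns (the surrounding
    context is illustrative, not lifted verbatim from the kernel):

        xfs_ilock(ip, XFS_ILOCK_EXCL);
        /* non-zero lock flags: the transaction commit unlocks the inode */
        xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);

        /* or, zero lock flags: the caller keeps the lock and drops it
         * itself after the commit */
        xfs_trans_ijoin(tp, ip, 0);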

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Now that all the read-only users of xfs_bmapi have been converted to
    use xfs_bmapi_read(), we can remove all the read-only handling cases
    from xfs_bmapi().

    Once this is done, rename xfs_bmapi to xfs_bmapi_write to reflect
    the fact it is for allocation only. This enables us to kill the
    XFS_BMAPI_WRITE flag as well.

    Also clean up xfs_bmapi_write to the style used in the newly added
    xfs_bmapi_read/delay functions.
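
    For reference, a sketch of the two interfaces after the split (argument
    lists paraphrased from the headers of that era; treat as illustrative,
    not exact):

        int xfs_bmapi_read(struct xfs_inode *ip, xfs_fileoff_t bno,
                           xfs_filblks_t len, struct xfs_bmbt_irec *mval,
                           int *nmap, int flags);

        int xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
                            xfs_fileoff_t bno, xfs_filblks_t len, int flags,
                            xfs_fsblock_t *firstblock, xfs_extlen_t total,
                            struct xfs_bmbt_irec *mval, int *nmap,
                            struct xfs_bmap_free *flist);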

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     
  • Delalloc reservations are much simpler than allocations, so give
    them a separate bmapi-level interface. Using the previously added
    xfs_bmapi_reserve_delalloc we get a function that is only minimally
    more complicated than xfs_bmapi_read, which is far from the complexity
    in xfs_bmapi. Also remove the XFS_BMAPI_DELAY code after switching
    over the only user to xfs_bmapi_delay.
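
    The delalloc-only interface needs neither a transaction nor any of the
    allocation arguments; roughly (paraphrased, not the exact prototype):

        int xfs_bmapi_delay(struct xfs_inode *ip, xfs_fileoff_t bno,
                            xfs_filblks_t len, struct xfs_bmbt_irec *mval,
                            int *nmap, int flags);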

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • xfs_bmapi() currently handles both extent map reading and
    allocation. As a result, the code is littered with "if (wr)"
    branches to conditionally do allocation operations if required.
    This makes the code much harder to follow and causes significant
    indent issues with the code.

    Given that read mapping is much simpler than allocation, we can
    split out read mapping from xfs_bmapi() and reuse the logic that
    we have already factored out to do all the hard work of handling the
    extent map manipulations. This results in a much simpler function for
    the common extent read operations, and will allow the allocation
    code to be simplified in another commit.

    Once xfs_bmapi_read() is implemented, convert all the callers of
    xfs_bmapi() that are only reading extents to use the new function.
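
    A typical converted read-side lookup then looks roughly like this
    (variable names and surrounding context are illustrative only):

        struct xfs_bmbt_irec    imap;
        int                     nimaps = 1;
        int                     error;

        /* map one range of the file; no transaction, firstblock or
         * free list is needed for a pure lookup */
        error = xfs_bmapi_read(ip, offset_fsb, end_fsb - offset_fsb,
                               &imap, &nimaps, 0);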

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Dave Chinner
     

28 Jan, 2011

1 commit

  • rounddown_power_of_2() returns an undefined result when passed a
    value of zero. The speculative delayed allocation code is doing this
    when the inode is zero length. Hence occasionally the preallocation
    is much, much larger than is necessary (e.g. 8GB for a 270 _byte_
    file). Ensure we don't even pass a zero value to this function so
    the result of preallocation is always the desired size.
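
    The guard amounts to something like this (a sketch; the surrounding
    preallocation sizing code is paraphrased):

        /* rounddown_pow_of_two() is undefined for 0, so bail out early */
        if (alloc_blocks == 0)
                return 0;
        alloc_blocks = rounddown_pow_of_two(alloc_blocks);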

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Alex Elder

    Dave Chinner
     

04 Jan, 2011

1 commit

  • Currently the size of the speculative preallocation during delayed
    allocation is fixed by either the allocsize mount option or a
    default size. We are seeing a lot of cases where we need to
    recommend using the allocsize mount option to prevent fragmentation
    when buffered writes land in the same AG.

    Rather than using a fixed preallocation size by default (up to 64k),
    make it dynamic by basing it on the current inode size. That way the
    EOF preallocation will increase as the file size increases. Hence
    for streaming writes we are much more likely to get large
    preallocations exactly when we need it to reduce fragmentation.

    For default settings, the size of the initial extents is determined
    by the number of parallel writers and the amount of memory in the
    machine. For 4GB RAM and 4 concurrent 32GB file writes:

    EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
    0: [0..1048575]: 1048672..2097247 0 (1048672..2097247) 1048576
    1: [1048576..2097151]: 5242976..6291551 0 (5242976..6291551) 1048576
    2: [2097152..4194303]: 12583008..14680159 0 (12583008..14680159) 2097152
    3: [4194304..8388607]: 25165920..29360223 0 (25165920..29360223) 4194304
    4: [8388608..16777215]: 58720352..67108959 0 (58720352..67108959) 8388608
    5: [16777216..33554423]: 117440584..134217791 0 (117440584..134217791) 16777208
    6: [33554424..50331511]: 184549056..201326143 0 (184549056..201326143) 16777088
    7: [50331512..67108599]: 251657408..268434495 0 (251657408..268434495) 16777088

    and for 16 concurrent 16GB file writes:

    EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
    0: [0..262143]: 2490472..2752615 0 (2490472..2752615) 262144
    1: [262144..524287]: 6291560..6553703 0 (6291560..6553703) 262144
    2: [524288..1048575]: 13631592..14155879 0 (13631592..14155879) 524288
    3: [1048576..2097151]: 30408808..31457383 0 (30408808..31457383) 1048576
    4: [2097152..4194303]: 52428904..54526055 0 (52428904..54526055) 2097152
    5: [4194304..8388607]: 104857704..109052007 0 (104857704..109052007) 4194304
    6: [8388608..16777215]: 209715304..218103911 0 (209715304..218103911) 8388608
    7: [16777216..33554423]: 452984848..469762055 0 (452984848..469762055) 16777208

    Because it is hard to take back speculative preallocation, cases
    where there are large slow growing log files on a nearly full
    filesystem may cause premature ENOSPC. Hence as the filesystem nears
    full, the maximum dynamic prealloc size is reduced according to this
    table (based on 4k block size):

    freespace       max prealloc size
      >5%             full extent (8GB)
      4-5%            2GB (8GB >> 2)
      3-4%            1GB (8GB >> 3)
      2-3%            512MB (8GB >> 4)
      1-2%            256MB (8GB >> 5)
      <1%             128MB (8GB >> 6)

    This should reduce the amount of space held in speculative
    preallocation for such cases.

    The allocsize mount option turns off the dynamic behaviour and fixes
    the prealloc size to whatever the mount option specifies. i.e. the
    behaviour is unchanged.
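
    As an illustration of the scaling, a simplified sketch (not the exact
    kernel code; the function name is hypothetical, field and helper names
    follow the usual XFS conventions):

        xfs_fsblock_t
        xfs_prealloc_size_sketch(struct xfs_mount *mp, struct xfs_inode *ip)
        {
                xfs_fsblock_t   alloc_blocks;
                xfs_fsblock_t   freesp;
                int             shift = 0;

                /* base the preallocation on the current inode size */
                alloc_blocks = XFS_B_TO_FSB(mp, ip->i_size);
                if (!alloc_blocks)
                        return 0;
                alloc_blocks = rounddown_pow_of_two(alloc_blocks);

                /* halve it repeatedly as free space drops below 5% */
                freesp = mp->m_sb.sb_fdblocks;
                if (freesp < mp->m_sb.sb_dblocks / 100 * 5) {
                        shift = 2;
                        if (freesp < mp->m_sb.sb_dblocks / 100 * 4)
                                shift++;
                        if (freesp < mp->m_sb.sb_dblocks / 100 * 3)
                                shift++;
                        if (freesp < mp->m_sb.sb_dblocks / 100 * 2)
                                shift++;
                        if (freesp < mp->m_sb.sb_dblocks / 100)
                                shift++;
                }
                return alloc_blocks >> shift;
        }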

    Signed-off-by: Dave Chinner

    Dave Chinner
     

17 Dec, 2010

2 commits

    Opencode the xfs_iomap code in its two callers. The overlap of
    passed flags already was minimal and will be further reduced in the
    next patch.

    As a side effect the BMAPI_* flags for xfs_bmapi and the IO_* flags
    for I/O end processing are merged into a single set of flags, which
    should be a bit more descriptive of the operation we perform.

    Also improve the tracing by giving each caller its own set of
    tracepoints.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
    Stop passing the BMAPI_* flags to these helpers: in
    xfs_iomap_write_direct the check for BMAPI_DIRECT was always true, and
    in the xfs_iomap_write_delay path it was never checked at all.
    Remove the nmap return value as we never make use of it.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

27 Jul, 2010

6 commits

  • Remove the flags argument to __xfs_get_blocks as we can easily derive
    it from the direct argument, and remove the unused BMAPI_MMAP flag.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
    xfs_iomap passes an xfs_bmbt_irec pointer to xfs_iomap_write_direct and
    xfs_iomap_write_allocate to give them the results of our read-only
    xfs_bmapi query. Instead of allocating a new xfs_bmbt_irec on the stack
    for the next call to xfs_bmapi, reuse the one we were passed, as it is
    not used after this point.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • This code was introduced four years ago in commit
    3e57ecf640428c01ba1ed8c8fc538447ada1715b without any review and has
    been unused since. Remove it, like the rest of the code introduced
    in that commit, to reduce the stack usage and complexity in this
    central piece of code.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Currently we need to either call IHOLD or xfs_trans_ihold on an inode when
    joining it to a transaction via xfs_trans_ijoin.

    This patch instead makes xfs_trans_ijoin usable on its own by doing
    an implicit xfs_trans_ihold, which also allows us to drop the third
    argument. For the case where we want to hold a reference on the inode
    an xfs_trans_ijoin_ref wrapper is added which does the IHOLD and marks
    the inode as needing an xfs_iput. In addition to the cleaner interface
    to the caller this also simplifies the implementation.
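
    Roughly, the before/after for a caller that wants the transaction to
    hold a reference and unlock the inode at commit (sketch only):

        /* before */
        xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
        xfs_trans_ihold(tp, ip);

        /* after */
        xfs_trans_ijoin_ref(tp, ip, XFS_ILOCK_EXCL);

        /* after, when the caller keeps its own reference and lock */
        xfs_trans_ijoin(tp, ip);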

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     
    Dmapi support was never merged upstream, but we still have a lot of
    hooks bloating XFS for it, all over the fast paths of the filesystem.

    This patch drops over 700 lines of dmapi overhead. If we ever get HSM
    support in mainline, at least the namespace events can be handled much
    more cleanly in the VFS instead of in the individual filesystem, so
    it's not like this is much help for future work.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Dave Chinner

    Christoph Hellwig
     

19 May, 2010

7 commits

  • And also drop a useless argument to xfs_iomap_write_direct.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Now that struct xfs_iomap contains exactly the same units as struct
    xfs_bmbt_irec we can just use the latter directly in the aops code.
    Replace the missing IOMAP_NEW flag with a new boolean output
    parameter to xfs_iomap.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Report the iomap_bn field of struct xfs_iomap in terms of filesystem
    blocks instead of in terms of bytes. Shift the byte conversions
    into the caller, and replace the IOMAP_DELAY and IOMAP_HOLE flag
    checks with checks for HOLESTARTBLOCK and DELAYSTARTBLOCK.
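
    Schematically, the caller-side checks change along these lines (a
    sketch; the old flag-based form is paraphrased):

        /* before: flag checks on struct xfs_iomap */
        if (iomap.iomap_flags & IOMAP_HOLE)
                ...

        /* after: look at the block number in the xfs_bmbt_irec itself */
        if (imap.br_startblock == HOLESTARTBLOCK ||
            imap.br_startblock == DELAYSTARTBLOCK) {
                /* no real disk block behind this range yet */
        } else {
                /* the byte/daddr conversion now happens in the caller */
                bn = XFS_FSB_TO_DADDR(mp, imap.br_startblock);
        }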

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Report the iomap_offset and iomap_bsize fields of struct xfs_iomap
    in terms of fsblocks instead of in terms of disk blocks. Shift the
    byte conversions into the callers temporarily, but they will
    disappear or get cleaned up later.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • The iomap_delta field in struct xfs_iomap just contains the
    difference between the offset passed to xfs_iomap and the
    iomap_offset. Just calculate it in the only caller that cares.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • Instead of using the iomap_target field in struct xfs_iomap
    and the IOMAP_REALTIME flag just use the already existing
    xfs_find_bdev_for_inode helper. There's some fallout as we
    need to pass the inode in a few more places, which we also
    use to sanitize some calling conventions.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • We only call xfs_iomap for single mappings anyway, so remove all
    code dealing with multiple mappings from xfs_imap_to_bmap and add
    asserts that we never get results that we do not expect.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

15 Dec, 2009

1 commit

  • Convert the old xfs tracing support that could only be used with the
    out of tree kdb and xfsidbg patches to use the generic event tracer.

    To use it make sure CONFIG_EVENT_TRACING is enabled and then enable
    all xfs trace channels by:

    echo 1 > /sys/kernel/debug/tracing/events/xfs/enable

    or alternatively enable single events by just doing the same in one
    event subdirectory, e.g.

    echo 1 > /sys/kernel/debug/tracing/events/xfs/xfs_ihold/enable

    or set more complex filters, etc. In Documentation/trace/events.txt
    all this is described in more detail. To read the events, do a

    cat /sys/kernel/debug/tracing/trace

    Compared to the last posting this patch converts the tracing mostly to
    the one tracepoint per callsite model that other users of the new
    tracing facility also employ. This allows a very fine-grained control
    of the tracing, a cleaner output of the traces and also enables the
    perf tool to use each tracepoint as a virtual performance counter,
    allowing us e.g. to count how often certain workloads hit various
    spots in XFS. Take a look at

    http://lwn.net/Articles/346470/

    for some examples.
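
    A couple of hedged examples of what this enables (the event used and
    the filter field/value are plausible illustrations only; check the
    per-event "format" file for the real fields):

        # count how often a single tracepoint fires during a workload
        perf stat -e xfs:xfs_ihold -a sleep 10

        # only trace events for one device, using the generic event filters
        echo 'dev == 0x800010' > \
            /sys/kernel/debug/tracing/events/xfs/xfs_ilock/filter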

    Also the btree tracing isn't included at all yet, as it will require
    additional core tracing features that are not in mainline yet; I plan
    to deliver it later.

    And the really nice thing about this patch is that it actually removes
    many lines of code while adding this nice functionality:

    fs/xfs/Makefile | 8
    fs/xfs/linux-2.6/xfs_acl.c | 1
    fs/xfs/linux-2.6/xfs_aops.c | 52 -
    fs/xfs/linux-2.6/xfs_aops.h | 2
    fs/xfs/linux-2.6/xfs_buf.c | 117 +--
    fs/xfs/linux-2.6/xfs_buf.h | 33
    fs/xfs/linux-2.6/xfs_fs_subr.c | 3
    fs/xfs/linux-2.6/xfs_ioctl.c | 1
    fs/xfs/linux-2.6/xfs_ioctl32.c | 1
    fs/xfs/linux-2.6/xfs_iops.c | 1
    fs/xfs/linux-2.6/xfs_linux.h | 1
    fs/xfs/linux-2.6/xfs_lrw.c | 87 --
    fs/xfs/linux-2.6/xfs_lrw.h | 45 -
    fs/xfs/linux-2.6/xfs_super.c | 104 ---
    fs/xfs/linux-2.6/xfs_super.h | 7
    fs/xfs/linux-2.6/xfs_sync.c | 1
    fs/xfs/linux-2.6/xfs_trace.c | 75 ++
    fs/xfs/linux-2.6/xfs_trace.h | 1369 +++++++++++++++++++++++++++++++++++++++++
    fs/xfs/linux-2.6/xfs_vnode.h | 4
    fs/xfs/quota/xfs_dquot.c | 110 ---
    fs/xfs/quota/xfs_dquot.h | 21
    fs/xfs/quota/xfs_qm.c | 40 -
    fs/xfs/quota/xfs_qm_syscalls.c | 4
    fs/xfs/support/ktrace.c | 323 ---------
    fs/xfs/support/ktrace.h | 85 --
    fs/xfs/xfs.h | 16
    fs/xfs/xfs_ag.h | 14
    fs/xfs/xfs_alloc.c | 230 +-----
    fs/xfs/xfs_alloc.h | 27
    fs/xfs/xfs_alloc_btree.c | 1
    fs/xfs/xfs_attr.c | 107 ---
    fs/xfs/xfs_attr.h | 10
    fs/xfs/xfs_attr_leaf.c | 14
    fs/xfs/xfs_attr_sf.h | 40 -
    fs/xfs/xfs_bmap.c | 507 +++------------
    fs/xfs/xfs_bmap.h | 49 -
    fs/xfs/xfs_bmap_btree.c | 6
    fs/xfs/xfs_btree.c | 5
    fs/xfs/xfs_btree_trace.h | 17
    fs/xfs/xfs_buf_item.c | 87 --
    fs/xfs/xfs_buf_item.h | 20
    fs/xfs/xfs_da_btree.c | 3
    fs/xfs/xfs_da_btree.h | 7
    fs/xfs/xfs_dfrag.c | 2
    fs/xfs/xfs_dir2.c | 8
    fs/xfs/xfs_dir2_block.c | 20
    fs/xfs/xfs_dir2_leaf.c | 21
    fs/xfs/xfs_dir2_node.c | 27
    fs/xfs/xfs_dir2_sf.c | 26
    fs/xfs/xfs_dir2_trace.c | 216 ------
    fs/xfs/xfs_dir2_trace.h | 72 --
    fs/xfs/xfs_filestream.c | 8
    fs/xfs/xfs_fsops.c | 2
    fs/xfs/xfs_iget.c | 111 ---
    fs/xfs/xfs_inode.c | 67 --
    fs/xfs/xfs_inode.h | 76 --
    fs/xfs/xfs_inode_item.c | 5
    fs/xfs/xfs_iomap.c | 85 --
    fs/xfs/xfs_iomap.h | 8
    fs/xfs/xfs_log.c | 181 +----
    fs/xfs/xfs_log_priv.h | 20
    fs/xfs/xfs_log_recover.c | 1
    fs/xfs/xfs_mount.c | 2
    fs/xfs/xfs_quota.h | 8
    fs/xfs/xfs_rename.c | 1
    fs/xfs/xfs_rtalloc.c | 1
    fs/xfs/xfs_rw.c | 3
    fs/xfs/xfs_trans.h | 47 +
    fs/xfs/xfs_trans_buf.c | 62 -
    fs/xfs/xfs_vnodeops.c | 8
    70 files changed, 2151 insertions(+), 2592 deletions(-)

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

12 Dec, 2009

1 commit

  • When completing I/O requests we must not allow the memory allocator to
    recurse into the filesystem, as we might deadlock on waiting for the
    I/O completion otherwise. The only thing currently allocating normal
    GFP_KERNEL memory is the allocation of the transaction structure for
    the unwritten extent conversion. Add a memflags argument to
    _xfs_trans_alloc to allow controlling the allocator behaviour.
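
    The unwritten extent conversion path then asks for a NOFS allocation
    explicitly; a rough sketch (the transaction type and flag shown are
    illustrative):

        /* NOFS allocation so I/O completion cannot recurse back into
         * the filesystem and deadlock waiting on itself */
        tp = _xfs_trans_alloc(mp, XFS_TRANS_STRAT_WRITE, KM_NOFS);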

    Signed-off-by: Christoph Hellwig
    Reported-by: Thomas Neumann
    Tested-by: Thomas Neumann
    Reviewed-by: Alex Elder
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

10 Jun, 2009

1 commit

  • This patch rips out the XFS ACL handling code and uses the generic
    fs/posix_acl.c code instead. The ondisk format is of course left
    unchanged.

    This also introduces the same ACL caching all other Linux filesystems do
    by adding pointers to the acl and default acl in struct xfs_inode.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Eric Sandeen

    Christoph Hellwig
     

08 Jun, 2009

1 commit

  • Kill the quota ops function vector and replace it with direct calls or
    stubs in the CONFIG_XFS_QUOTA=n case.

    Make sure we check XFS_IS_QUOTA_RUNNING in the right spots. We can
    remove a number of those checks because the XFS_TRANS_DQ_DIRTY flag
    can't be set otherwise.

    This brings us back closer to the way this code worked in IRIX and earlier
    Linux versions, but we keep a lot of the more useful factoring of common
    code.

    Eventually we should also kill xfs_qm_bhv.c, but that's left for a later
    patch.

    Reduces the size of the source code by about 250 lines and the size of
    XFS module by about 1.5 kilobytes with quotas enabled:

    text data bss dec hex filename
    615957 2960 3848 622765 980ad fs/xfs/xfs.o
    617231 3152 3848 624231 98667 fs/xfs/xfs.o.old

    Fallout:

    - xfs_qm_dqattach is split into xfs_qm_dqattach_locked which expects
    the inode locked and xfs_qm_dqattach which does the locking around it,
    thus removing XFS_QMOPT_ILOCKED.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Eric Sandeen

    Christoph Hellwig
     

07 Apr, 2009

3 commits

  • The only thing we need to do now when we get an ENOSPC condition during delayed
    allocation reservation is flush all the other inodes with delalloc blocks on
    them and retry without EOF preallocation. Remove the unneeded mess that is
    xfs_flush_space() and just call xfs_flush_inodes() directly from
    xfs_iomap_write_delay().

    Also, change the location of the retry label to avoid trying to do EOF
    preallocation because we don't want to do that at ENOSPC. This enables us to
    remove the BMAPI_SYNC flag as it is no longer used.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • When we are writing to a single file and hit ENOSPC, we trigger a background
    flush of the inode and try again. Because we hold page locks and the iolock,
    the flush won't proceed until after we release these locks. This occurs once
    we've given up and ENOSPC has been reported. Hence if this one is the only
    dirty inode in the system, we'll get an ENOSPC prematurely.

    To fix this, remove the async flush from the allocation routines and move
    it to the top of the write path where we can do a synchronous flush
    and retry the write again. Only retry once as a second ENOSPC indicates
    that we really are ENOSPC.

    This avoids a page cache deadlock when trying to do this flush synchronously
    in the allocation layer that was identified by Mikulas Patocka.
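
    The retry now lives at the top of the write path; schematically (the
    write call itself is elided and the surrounding names are
    illustrative, not the exact kernel code):

        int     enospc = 0;

    retry:
        /* do the buffered write into the page cache; sets ret */
        if (ret == -ENOSPC && !enospc) {
                enospc = 1;             /* a second ENOSPC is the real thing */
                xfs_flush_inodes(ip);   /* synchronous flush of delalloc data */
                goto retry;
        }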

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
    Currently xfs_device_flush calls sync_blockdev() which is
    a no-op for XFS as all its metadata is held in a different
    address space to the one sync_blockdev() works on.

    Call xfs_sync_inodes() instead to flush all the delayed
    allocation blocks out. To do this as efficiently as possible,
    do it via two passes - one to do an async flush of all the
    dirty blocks and a second to wait for all the IO to complete.
    This requires some modification to the xfs_sync_inodes_ag()
    flush code to do this efficiently.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     

22 Dec, 2008

1 commit

  • Speculative allocation beyond eof doesn't work properly. It was
    broken some time ago after a code cleanup that moved what is now
    xfs_iomap_eof_align_last_fsb() and xfs_iomap_eof_want_preallocate()
    out of xfs_iomap_write_delay() into separate functions. The code
    used to use the current file size in various checks but got changed
    to be max(file_size, i_new_size). Since i_new_size is the result
    of 'offset + count' then in xfs_iomap_eof_want_preallocate() the
    check for '(offset + count)
    Signed-off-by: Lachlan McIlroy

    Lachlan McIlroy
     

28 Jul, 2008

1 commit

  • The bmap btree split code relies on a previous data extent allocation
    (from xfs_bmap_btalloc()) to find an AG that has sufficient space to
    perform a full btree split, when inserting the extent. When converting
    unwritten extents we don't allocate a data extent so a btree split will be
    the first allocation. In this case we need to set minleft so the allocator
    will pick an AG that has space to complete the split(s).

    SGI-PV: 983338

    SGI-Modid: xfs-linux-melb:xfs-kern:31357a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: David Chinner

    Lachlan McIlroy
     

29 Apr, 2008

2 commits

    The check for block zero access should only be done on non-realtime
    inodes. Fix the logic error in xfs_iomap_write_allocate(), and
    simplify the logic of all the checks for block zero access in
    xfs_iomap.c.

    SGI-PV: 980888
    SGI-Modid: xfs-linux-melb:xfs-kern:30998a

    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy

    David Chinner
     
    The writer field is not needed for non-DEBUG builds so remove it. While
    we're at it, also clean up the interface for is-locked asserts to go
    through an xfs_iget.c helper with an interface like the xfs_ilock
    routines, to isolate the XFS codebase from mrlock internals. That way
    we can kill mrlock_t entirely once rw_semaphores grow an is-locked
    facility. Also remove unused flags from the ilock family of functions.

    SGI-PV: 976035
    SGI-Modid: xfs-linux-melb:xfs-kern:30902a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy

    Christoph Hellwig
     

07 Feb, 2008

4 commits

  • Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if
    CONFIG_XFS_RT is off. This should be safe because of the mount-time
    checks in xfs_rtmount_init (a filesystem with a realtime volume will
    not mount without realtime support built in), so if we get mounted
    w/o CONFIG_XFS_RT, no realtime inodes should be encountered after
    that.

    Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space;
    presumably gcc can optimize around the various "if (0)" type checks:

    xfs_alloc_file_space    -8
    xfs_bmap_adjacent      -16
    xfs_bmapi               -8
    xfs_bmap_rtalloc       -16
    xfs_bunmapi            -28
    xfs_free_file_space    -64
    xfs_imap                +8
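
    The conditional definition is essentially (paraphrased):

        #ifdef CONFIG_XFS_RT
        #define XFS_IS_REALTIME_INODE(ip) \
                ((ip)->i_d.di_flags & XFS_DIFLAG_REALTIME)
        #else
        #define XFS_IS_REALTIME_INODE(ip)       (0)
        #endif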

    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy

    Eric Sandeen
     
  • Prevent transaction overrun in xfs_iomap_write_allocate() if we race with
    a truncate that overlaps the delalloc range we were planning to allocate.

    If we race, we may allocate into a hole and that requires block
    allocation. At this point in time we don't have a reservation for block
    allocation (apart from metadata blocks) and so allocating into a hole
    rather than a delalloc region results in overflowing the transaction block
    reservation.

    Fix it by only allowing a single extent to be allocated at a time.
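
    The fix boils down to capping the mapping count handed to the block
    mapper in xfs_iomap_write_allocate (sketch; the surrounding call is
    elided):

        /* allocate at most one extent per transaction so a racing
         * truncate can't force more allocation than we reserved for */
        nimaps = 1;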

    SGI-PV: 972757
    SGI-Modid: xfs-linux-melb:xfs-kern:30005a

    Signed-off-by: David Chinner
    Signed-off-by: Lachlan McIlroy

    David Chinner
     
  • xfs_iocore_t is a structure embedded in xfs_inode. Except for one field it
    just duplicates fields already in xfs_inode, and there is nothing this
    abstraction buys us on XFS/Linux. This patch removes it and shrinks source
    and binary size of xfs as well as shrinking the size of xfs_inode by
    60/44 bytes in debug/non-debug builds.

    SGI-PV: 970852
    SGI-Modid: xfs-linux-melb:xfs-kern:29754a

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Tim Shimmin

    Christoph Hellwig
     
    Currently there is an indirection called ioops in the XFS data I/O
    path. Various functions are called via function pointers, but there is
    no coherence in what this is for, and of course for XFS itself it's
    entirely unused. This patch removes it and significantly reduces the
    source and binary size of XFS while making maintenance easier.

    SGI-PV: 970841
    SGI-Modid: xfs-linux-melb:xfs-kern:29737a

    Signed-off-by: Lachlan McIlroy
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Tim Shimmin

    Lachlan McIlroy