16 Sep, 2020

1 commit


07 Jul, 2020

1 commit

  • Inode reclaim still throttles direct reclaim on the per-AG
    reclaim locks. This is no longer necessary, as reclaim can now run
    non-blocking. Hence we can remove these locks so that we don't
    arbitrarily block reclaimers just because there are more direct
    reclaimers than there are AGs.

    This can result in multiple reclaimers working on the same range of
    an AG, but this doesn't cause any apparent issues. Optimising the
    spread of concurrent reclaimers for best efficiency can be done in
    a future patchset. A sketch of the resulting concurrency pattern
    follows this entry.

    Signed-off-by: Dave Chinner
    Reviewed-by: Darrick J. Wong
    Reviewed-by: Brian Foster
    Signed-off-by: Darrick J. Wong

    Dave Chinner
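
    A minimal userspace analogue of the resulting concurrency pattern,
    with invented names throughout (NR_AGS, reclaim_ag_work, the
    cursor); it illustrates the idea only and is not the XFS code:
    reclaimers rotate over the AGs via a shared atomic cursor and may
    share an AG instead of serialising on per-AG locks.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NR_AGS      4
    #define NR_WORKERS  8   /* more direct reclaimers than AGs */

    static atomic_uint ag_cursor;   /* rotates reclaim work over the AGs */

    /* stand-in for scanning one AG's reclaimable inodes */
    static void reclaim_ag_work(unsigned int agno)
    {
        printf("reclaimer scanning AG %u\n", agno);
    }

    /* Each reclaimer takes the next AG in rotation.  Two workers may
     * land on the same AG, which the message above says is acceptable;
     * nobody blocks just because every AG is already being scanned. */
    static void *reclaimer(void *arg)
    {
        unsigned int agno;

        (void)arg;
        agno = atomic_fetch_add(&ag_cursor, 1) % NR_AGS;
        reclaim_ag_work(agno);
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NR_WORKERS];
        int i;

        for (i = 0; i < NR_WORKERS; i++)
            pthread_create(&tids[i], NULL, reclaimer, NULL);
        for (i = 0; i < NR_WORKERS; i++)
            pthread_join(tids[i], NULL);
        return 0;
    }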
     

27 May, 2020

3 commits

  • m_active_trans is a global atomic counter, and we are hitting it at
    a rate of half a million transactions a second, so it bounces the
    counter cacheline all over the place on large machines. We don't
    actually need it anymore: it used to be required because the VFS
    freeze code could not track/prevent filesystem transactions that
    were running, but that problem no longer exists.

    Hence to remove the counter, we simply have to ensure that nothing
    calls xfs_sync_sb() while we are trying to quiesce the filesystem.
    That only happens if the log worker is still running when we call
    xfs_quiesce_attr(). The log worker is cancelled at the end of
    xfs_quiesce_attr() by calling xfs_log_quiesce(), so just call it
    early here and then we can remove the counter altogether.

    Concurrent create, 50 million inodes, identical 16p/16GB virtual
    machines on different physical hosts. Machine A has twice the CPU
    cores per socket of machine B:

                   unpatched     patched
    machine A:     3m16s         2m00s
    machine B:     4m04s         4m05s

    Create rates:
                   unpatched     patched
    machine A:     282k+/-31k    468k+/-21k
    machine B:     231k+/-8k     233k+/-11k

    Concurrent rm of same 50 million inodes:

                   unpatched     patched
    machine A:     6m42s         2m33s
    machine B:     4m47s         4m47s

    The transaction rate on the fast machine went from just under
    300k/sec to 700k/sec, which indicates just how much of a bottleneck
    this atomic counter was. A userspace sketch of the cacheline
    bouncing follows this entry.

    Signed-off-by: Dave Chinner
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Dave Chinner
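
    A runnable userspace illustration of the cacheline bouncing
    described above; every name in it is invented and nothing here is
    kernel code. Sixteen threads hammer one shared atomic, then each
    thread updates its own cacheline-aligned counter; on a multi-socket
    box the first case is dramatically slower. Build with
    cc -O2 -pthread.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <time.h>

    #define NR_THREADS 16
    #define NR_OPS     (1 << 22)

    static atomic_long shared_counter;      /* one write-hot cacheline */

    struct padded { _Alignas(64) atomic_long v; };
    static struct padded local_counter[NR_THREADS];

    static void *bounce(void *arg)          /* every thread fights here */
    {
        long i;

        (void)arg;
        for (i = 0; i < NR_OPS; i++)
            atomic_fetch_add(&shared_counter, 1);
        return NULL;
    }

    static void *local(void *arg)           /* no line is ever shared */
    {
        struct padded *c = arg;
        long i;

        for (i = 0; i < NR_OPS; i++)
            atomic_fetch_add(&c->v, 1);
        return NULL;
    }

    static double run(void *(*fn)(void *), int percpu)
    {
        pthread_t t[NR_THREADS];
        struct timespec a, b;
        int i;

        clock_gettime(CLOCK_MONOTONIC, &a);
        for (i = 0; i < NR_THREADS; i++)
            pthread_create(&t[i], NULL, fn,
                    percpu ? (void *)&local_counter[i] : NULL);
        for (i = 0; i < NR_THREADS; i++)
            pthread_join(t[i], NULL);
        clock_gettime(CLOCK_MONOTONIC, &b);
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    }

    int main(void)
    {
        printf("shared atomic counter: %.2fs\n", run(bounce, 0));
        printf("per-thread counters:   %.2fs\n", run(local, 1));
        return 0;
    }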
     
  • We are seeing massive CPU usage from xfs_agino_range() on one
    machine; instruction-level profiles look similar to another machine
    running the same workload, but one machine is consuming 10x as much
    CPU as the other and going much slower. The only real difference
    between the two machines is the core count per socket. Both are
    running identical 16p/16GB virtual machine configurations.

    Machine A:

    25.83% [k] xfs_agino_range
    12.68% [k] __xfs_dir3_data_check
    6.95% [k] xfs_verify_ino
    6.78% [k] xfs_dir2_data_entry_tag_p
    3.56% [k] xfs_buf_find
    2.31% [k] xfs_verify_dir_ino
    2.02% [k] xfs_dabuf_map.constprop.0
    1.65% [k] xfs_ag_block_count

    And takes around 13 minutes to remove 50 million inodes.

    Machine B:

    13.90% [k] __pv_queued_spin_lock_slowpath
    3.76% [k] do_raw_spin_lock
    2.83% [k] xfs_dir3_leaf_check_int
    2.75% [k] xfs_agino_range
    2.51% [k] __raw_callee_save___pv_queued_spin_unlock
    2.18% [k] __xfs_dir3_data_check
    2.02% [k] xfs_log_commit_cil

    And takes around 5m30s to remove 50 million inodes.

    The suspect is cacheline contention on m_sectbb_log, which is used
    in one of the macros in xfs_agino_range. This is a read-only
    variable, but it shares a cacheline with m_active_trans, a global
    atomic that gets bounced all around the machine.

    The workload is trying to run hundreds of thousands of transactions
    per second, so cacheline contention will be occurring on this
    atomic counter. Hence xfs_agino_range() is likely just an innocent
    bystander as the cache coherency protocol fights over the cacheline
    between CPU cores and sockets.

    On machine A, this rearrangement of the struct xfs_mount
    results in the profile changing to:

    9.77% [kernel] [k] xfs_agino_range
    6.27% [kernel] [k] __xfs_dir3_data_check
    5.31% [kernel] [k] __pv_queued_spin_lock_slowpath
    4.54% [kernel] [k] xfs_buf_find
    3.79% [kernel] [k] do_raw_spin_lock
    3.39% [kernel] [k] xfs_verify_ino
    2.73% [kernel] [k] __raw_callee_save___pv_queued_spin_unlock

    Vastly less CPU usage in xfs_agino_range(), but still 3x that of
    machine B, and the machine still runs substantially slower than it
    should.

    Current rm -rf of 50 million files:

                   vanilla      patched
    machine A:     13m20s       6m42s
    machine B:     5m30s        5m02s

    It's an improvement, indicating that separating out and further
    optimising read-only global filesystem data is worthwhile, but it
    clearly isn't the underlying issue causing this specific
    performance degradation. A sketch of the layout change follows
    this entry.

    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Dave Chinner
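
    A compilable sketch of the false sharing at issue, with invented
    field names standing in for the real struct xfs_mount layout: in
    the "bad" layout the read-only field shares a 64-byte line with the
    hot atomic, so pure readers pay for the writers' invalidations; the
    rearrangement pushes the hot counter onto its own line.

    #include <stdatomic.h>
    #include <stddef.h>
    #include <stdio.h>

    /* before: read-only-after-mount data next to a counter written
     * ~500k times a second */
    struct mount_bad {
        unsigned char sectbb_log;       /* read-only (cf. m_sectbb_log) */
        atomic_long   active_trans;     /* write-hot (cf. m_active_trans) */
    };

    /* after: read-mostly fields grouped together, write-hot state
     * isolated on its own 64-byte cacheline */
    struct mount_good {
        unsigned char sectbb_log;               /* read-mostly section */
        _Alignas(64) atomic_long active_trans;  /* hot section, own line */
    };

    int main(void)
    {
        printf("bad:  sectbb_log @%zu, active_trans @%zu (same line)\n",
               offsetof(struct mount_bad, sectbb_log),
               offsetof(struct mount_bad, active_trans));
        printf("good: sectbb_log @%zu, active_trans @%zu (separate)\n",
               offsetof(struct mount_good, sectbb_log),
               offsetof(struct mount_good, active_trans));
        return 0;
    }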
     
  • Shaokun Zhang reported that XFS was using substantial CPU time in
    percpu_counter_sum(), called from xfs_mod_ifree(), when running a
    single-threaded benchmark on a high-CPU-count (128p) machine. The
    issue is that the filesystem is empty when the benchmark runs, so
    inode allocation is running with a very low inode free count.

    With the percpu counter batching, this means comparisons when the
    counter is less than 128 * 256 = 32768 use the slow path of adding
    up all the counters across the CPUs, and this is expensive on high
    CPU count machines. A sketch of this batching tradeoff follows
    this entry.

    The summing in xfs_mod_ifree() is only used to fire an assert if an
    underrun occurs. The error is ignored by the higher level code.
    Hence this is really just debug code and we don't need to run it
    on production kernels, nor do we need such debug checks to return
    error values just to trigger an assert.

    Finally, xfs_mod_icount/xfs_mod_ifree are only called from
    xfs_trans_unreserve_and_mod_sb(), so get rid of them and just
    directly call the percpu_counter_add/percpu_counter_compare
    functions. The compare functions are now run only on debug builds as
    they are internal to ASSERT() checks and so only compiled in when
    ASSERTs are active (CONFIG_XFS_DEBUG=y or CONFIG_XFS_WARN=y).

    Reported-by: Shaokun Zhang
    Signed-off-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Dave Chinner
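
    A userspace sketch of the batching tradeoff, simplified from (and
    not identical to) the kernel's percpu_counter; every name below is
    invented. The fast path only works when the count is more than
    nr_cpus * batch away from the comparison target, so a nearly empty
    filesystem keeps comparing inside that window and sums all 128
    per-CPU deltas every time.

    #include <stdio.h>

    #define NR_CPUS 128
    #define BATCH   256

    struct pcp_counter {
        long count;             /* global value, updated in batches */
        long pcp[NR_CPUS];      /* per-CPU deltas, each |delta| < BATCH */
    };

    static long pcp_sum(struct pcp_counter *c)  /* the slow path */
    {
        long sum = c->count;
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
            sum += c->pcp[cpu];
        return sum;
    }

    /* returns <0, 0, >0, like the kernel's percpu_counter_compare() */
    static int pcp_compare(struct pcp_counter *c, long rhs)
    {
        long err = (long)NR_CPUS * BATCH;   /* max drift: 32768 */
        long sum;

        if (c->count > rhs + err)           /* fast path: no summing */
            return 1;
        if (c->count < rhs - err)
            return -1;
        sum = pcp_sum(c);                   /* too close to call */
        return (sum > rhs) - (sum < rhs);
    }

    int main(void)
    {
        struct pcp_counter ifree = { .count = 100 };    /* near-empty fs */

        /* 100 is well inside 128 * 256 of 0, so every comparison walks
         * all the per-CPU deltas: the reported hot spot */
        printf("compare vs 0 -> %d\n", pcp_compare(&ifree, 0));
        return 0;
    }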
     

07 May, 2020

1 commit

  • Both types control shutdown messaging and neither is used in the
    current codebase.

    Signed-off-by: Brian Foster
    Reviewed-by: Dave Chinner
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Reviewed-by: Allison Collins
    Signed-off-by: Darrick J. Wong

    Brian Foster
     

05 May, 2020

2 commits


17 Apr, 2020

1 commit

  • Move the inode dirty data flushing to a workqueue so that multiple
    threads can take advantage of a single thread's flushing work. The
    ratelimiting technique used in bdd4ee4 was not successful, because
    threads that skipped the inode flush scan due to ratelimiting would
    return ENOSPC early, which caused occasional (but noticeable)
    changes in behavior and sporadic fstest regressions.

    Therefore, make all the writer threads wait on a single inode
    flush, which eliminates both the stampeding hordes of flushers and
    the small window in which a write could fail with ENOSPC because it
    lost the ratelimit race even after another thread had freed space.
    A sketch of the shared-flush pattern follows this entry.

    Fixes: c6425702f21e ("xfs: ratelimit inode flush on buffered write ENOSPC")
    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster

    Darrick J. Wong
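
    Roughly the resulting call pattern, sketched from the description
    above rather than quoted from the diff (the workqueue and work-item
    field names are assumptions): queue_work() is a no-op if the work
    is already queued, and flush_work() waits for the running instance
    to finish, so any number of writers share one flush.

    void xfs_flush_inodes(struct xfs_mount *mp)
    {
        /* hundreds of ENOSPC-hitting writers may call this at once;
         * only one flush runs and everyone waits on it */
        queue_work(mp->m_sync_workqueue, &mp->m_flush_inodes_work);
        flush_work(&mp->m_flush_inodes_work);
    }

    /* the work item does the sync_inodes_sb() call that every writer
     * previously made for itself */
    static void xfs_flush_inodes_worker(struct work_struct *work)
    {
        struct xfs_mount *mp = container_of(work, struct xfs_mount,
                                            m_flush_inodes_work);

        sync_inodes_sb(mp->m_super);
    }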
     

31 Mar, 2020

1 commit

  • A customer reported RCU stalls and softlockup warnings on a
    computer with many CPU cores and many, many more IO threads trying
    to write to a filesystem that is totally out of space. Subsequent
    analysis pointed to the many, many IO threads calling
    xfs_flush_inodes -> sync_inodes_sb, which causes a lot of
    wb_writeback_work to be queued. The writeback worker spends so much
    time trying to wake the many, many threads waiting for writeback
    completion that it trips the softlockup detector, and (in this
    case) the system automatically reboots.

    In addition, they complain that the lengthy xfs_flush_inodes scan traps
    all of those threads in uninterruptible sleep, which hampers their
    ability to kill the program or do anything else to escape the situation.

    If there are thousands of threads trying to write to files on a
    full filesystem, each of those threads will start a separate copy
    of the inode flush scan. This is kind of pointless since we only
    need one scan, so rate limit the inode flush; a sketch of the
    ratelimit pattern follows this entry.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner

    Darrick J. Wong
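
    A sketch of the ratelimit shape, with illustrative values and a
    file-scope state rather than whatever the patch actually used;
    DEFINE_RATELIMIT_STATE and __ratelimit() are the stock helpers from
    <linux/ratelimit.h>. The skip is precisely the behaviour the
    17 Apr, 2020 entry above later removes: a caller that skips the
    scan can return ENOSPC even though the scan would have freed space.

    #include <linux/ratelimit.h>

    /* allow one flush scan per 30-second window; excess callers skip */
    static DEFINE_RATELIMIT_STATE(flush_ratelimit, 30 * HZ, 1);

    void xfs_flush_inodes(struct xfs_mount *mp)
    {
        if (!__ratelimit(&flush_ratelimit))
            return;     /* a scan ran recently: don't pile on */

        sync_inodes_sb(mp->m_super);
    }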
     

14 Nov, 2019

3 commits


11 Nov, 2019

2 commits


06 Nov, 2019

2 commits


30 Oct, 2019

7 commits


27 Aug, 2019

1 commit

  • When trying to correlate XFS kernel allocations to memory reclaim
    behaviour, it is useful to know what allocations XFS is actually
    attempting. This information is not directly available from the
    generic memory allocation and reclaim tracepoints, so these new
    trace points provide a high level indication of what the XFS memory
    demand actually is.

    There is no per-filesystem context in this code, so we just trace
    the type of allocation, the size and the allocation constraints.
    The kmem code also doesn't include much of the common XFS headers,
    so there are a few definitions that need to be added to the trace
    headers and a couple of types that need to be made common to avoid
    needing to include the whole world in the kmem code. A sketch of
    the trace point shape follows this entry.

    Signed-off-by: Dave Chinner
    Reviewed-by: Brian Foster
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Dave Chinner
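
    A hedged sketch of what one such trace point can look like; the
    event name and fields are invented for illustration, while
    TRACE_EVENT and the TP_* macros are the standard kernel tracing
    boilerplate used in trace headers.

    TRACE_EVENT(xfs_kmem_alloc_sketch,
        TP_PROTO(size_t size, int flags, unsigned long caller_ip),
        TP_ARGS(size, flags, caller_ip),
        TP_STRUCT__entry(
            __field(size_t, size)               /* allocation size */
            __field(int, flags)                 /* allocation constraints */
            __field(unsigned long, caller_ip)   /* who asked for it */
        ),
        TP_fast_assign(
            __entry->size = size;
            __entry->flags = flags;
            __entry->caller_ip = caller_ip;
        ),
        TP_printk("size %zu flags 0x%x caller %pS",
                  __entry->size, __entry->flags,
                  (void *)__entry->caller_ip)
    );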
     

29 Jun, 2019

1 commit


12 Jun, 2019

2 commits


27 Apr, 2019

1 commit

  • Add a percpu counter to track the number of blocks directly
    reserved for delayed allocations on the data device. This counter
    (in contrast to i_delayed_blks) does not track allocated CoW
    staging extents or anything going on with the realtime device. It
    will be used in the upcoming summary counter scrub function to
    check the free block counts without having to freeze the filesystem
    or walk all the inodes to find the delayed allocations. A sketch of
    the counter's lifecycle follows this entry.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Dave Chinner

    Darrick J. Wong
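
    A sketch of the counter's lifecycle using the stock
    <linux/percpu_counter.h> API; the m_delalloc_blks field name and
    the call sites are my reading of the patch and should be treated as
    illustrative.

    /* mount: set up the counter */
    error = percpu_counter_init(&mp->m_delalloc_blks, 0, GFP_KERNEL);

    /* buffered write paths: a delalloc reservation against the data
     * device is created or released */
    percpu_counter_add(&mp->m_delalloc_blks, alloc_blocks);
    percpu_counter_add(&mp->m_delalloc_blks, -freed_blocks);

    /* scrub: read an exact total without freezing the filesystem or
     * walking every inode's i_delayed_blks */
    delalloc = percpu_counter_sum(&mp->m_delalloc_blks);

    /* unmount: tear it down */
    percpu_counter_destroy(&mp->m_delalloc_blks);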
     

17 Apr, 2019

1 commit


15 Apr, 2019

2 commits


21 Feb, 2019

1 commit

  • Add a mode where XFS never overwrites existing blocks in place.
    This is to aid debugging our COW code, and also puts infrastructure
    in place for things like possible future support for zoned block
    devices, which can't support overwrites.

    This mode is enabled globally by doing a:

    echo 1 > /sys/fs/xfs/debug/always_cow

    Note that the parameter is global to allow running all tests in xfstests
    easily in this mode, which would not easily be possible with a per-fs
    sysfs file.

    In always_cow mode persistent preallocations are disabled, and
    fallocate will fail when called with a 0 mode (with or without
    FALLOC_FL_KEEP_SIZE), and will not create unwritten extents for
    zeroed space when called with FALLOC_FL_ZERO_RANGE or
    FALLOC_FL_UNSHARE_RANGE.

    There are a few interesting xfstests failures when run in always_cow
    mode:

    - generic/392 fails because the bytes used in the file used to test
      hole punch recovery are lower after the log replay. This is
      because the blocks written and then punched out are only freed
      with a delay due to the logging mechanism.
    - xfs/170 will fail as the already fragile file streams mechanism
      doesn't seem to interact well with the COW allocator.
    - xfs/180, xfs/182, xfs/192, xfs/198, xfs/204 and xfs/208 will
      claim the file system is badly fragmented, but there is not much
      we can do to avoid that when always writing out of place.
    - xfs/205 fails because overwriting a file in always_cow mode will
      require new space allocation, and the assumptions in the test
      thus no longer hold.
    - xfs/326 fails to modify the file at all in always_cow mode after
      injecting the refcount error, leading to an unexpected md5sum
      after the remount, but that again is expected.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Christoph Hellwig
     

15 Feb, 2019

1 commit


12 Feb, 2019

1 commit

  • Use a rhashtable to cache the unlinked list incore. This should speed
    up unlinked processing considerably when there are a lot of inodes on
    the unlinked list because iunlink_remove no longer has to traverse an
    entire bucket list to find which inode points to the one being removed.

    The incore list structure records "X.next_unlinked = Y" relations,
    with the rhashtable using Y to index the records. This makes
    finding the inode X that points to an inode Y very quick. If our
    cache fails to find anything we can always fall back on the old
    method. A sketch of the structure follows this entry.

    FWIW this drastically reduces the amount of time it takes to remove
    inodes from the unlinked list. I wrote a program to open a lot of
    O_TMPFILE files and then close them in the same order, which takes
    a very long time if we have to traverse the unlinked lists. With
    the patch, I see:

    + /d/t/tmpfile/tmpfile
    Opened 193531 files in 6.33s.
    Closed 193531 files in 5.86s

    real 0m12.192s
    user 0m0.064s
    sys 0m11.619s
    + cd /
    + umount /mnt

    real 0m0.050s
    user 0m0.004s
    sys 0m0.030s

    And without the patch:

    + /d/t/tmpfile/tmpfile
    Opened 193588 files in 6.35s.
    Closed 193588 files in 751.61s

    real 12m38.853s
    user 0m0.084s
    sys 12m34.470s
    + cd /
    + umount /mnt

    real 0m0.086s
    user 0m0.000s
    sys 0m0.060s

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster

    Darrick J. Wong
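
    A sketch of the incore backref structure; the names follow my
    reading of the patch and should be treated as illustrative rather
    than quoted. Each record encodes one "X.next_unlinked = Y" relation
    and is hashed on Y, so iunlink_remove can find X with a single
    lookup instead of walking a bucket list.

    /* one record per "X.next_unlinked = Y" relation */
    struct xfs_iunlink {
        struct rhash_head   iu_rhash_head;
        xfs_agino_t         iu_agino;           /* X */
        xfs_agino_t         iu_next_unlinked;   /* Y: the hash key */
    };

    static const struct rhashtable_params xfs_iunlink_hash_params = {
        .key_len     = sizeof(xfs_agino_t),
        .key_offset  = offsetof(struct xfs_iunlink, iu_next_unlinked),
        .head_offset = offsetof(struct xfs_iunlink, iu_rhash_head),
    };

    /* answer "whose next_unlinked points at agino?" for
     * iunlink_remove without walking the bucket list */
    static struct xfs_iunlink *
    xfs_iunlink_lookup_backref(struct xfs_perag *pag, xfs_agino_t agino)
    {
        return rhashtable_lookup_fast(&pag->pagi_unlinked_hash,
                                      &agino, xfs_iunlink_hash_params);
    }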
     

13 Dec, 2018

3 commits

  • The realtime summary is a two-dimensional array on disk, effectively:

    u32 rsum[log2(number of realtime extents) + 1][number of blocks in the bitmap]

    rsum[log][bbno] is the number of extents of size 2**log which start in
    bitmap block bbno.

    xfs_rtallocate_extent_near() uses xfs_rtany_summary() to check whether
    rsum[log][bbno] != 0 for any log level. However, the summary array is
    stored in row-major order (i.e., like an array in C), so all of these
    entries are not adjacent, but rather spread across the entire summary
    file. In the worst case (a full bitmap block), xfs_rtany_summary() has
    to check every level.

    This means that on a moderately-used realtime device, an allocation
    will waste a lot of time finding, reading, and releasing buffers
    for the realtime summary. In particular, one of our storage
    services (which runs on servers with 8 very slow CPUs and fifteen
    8 TB XFS realtime filesystems) spends almost 5% of its CPU cycles
    in xfs_rtbuf_get() and xfs_trans_brelse() called from
    xfs_rtany_summary().

    One solution would be to also store the summary with the dimensions
    swapped. However, this would require a disk format change to a very old
    component of XFS.

    Instead, we can cache the minimum size which contains any extents.
    We do so lazily; rather than guaranteeing that the cache contains
    the precise minimum, it always contains a loose lower bound which
    we tighten when we read or update a summary block. This only uses a
    few kilobytes of memory and is already serialized via the realtime
    bitmap and summary inode locks, so the cost is minimal. With this
    change, the same workload only spends 0.2% of its CPU cycles in the
    realtime allocator. A sketch of the cache maintenance follows this
    entry.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Omar Sandoval
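
    A userspace sketch of the lazy lower-bound maintenance, with
    invented sizes and names. The invariant is that every summary count
    below rsum_cache[bbno] is zero, so a search can start at the cached
    level; the bound is tightened opportunistically whenever a summary
    count is read or written.

    #include <stdint.h>

    #define NR_BBLOCKS 1024     /* bitmap blocks, for illustration */

    /* invariant: rsum[log][bbno] == 0 for all log < rsum_cache[bbno];
     * starting at all zeroes makes the invariant trivially true */
    static uint8_t rsum_cache[NR_BBLOCKS];

    /* call whenever rsum[log][bbno] is observed or updated */
    static void rsum_cache_note(uint32_t bbno, uint8_t log, uint32_t count)
    {
        if (count == 0 && rsum_cache[bbno] == log)
            rsum_cache[bbno] = log + 1;     /* levels <= log all zero now */
        else if (count != 0 && rsum_cache[bbno] > log)
            rsum_cache[bbno] = log;         /* keep the invariant honest */
    }

    /* an xfs_rtany_summary()-style search starts here instead of at
     * level 0, skipping the spread-out summary blocks below the bound */
    static uint8_t rsum_search_start(uint32_t bbno)
    {
        return rsum_cache[bbno];
    }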
     
  • Store the inode cluster alignment information in units of inodes and
    blocks in the mount data so that we don't have to keep recalculating
    them.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster

    Darrick J. Wong
     
  • Store the number of inodes and blocks per inode cluster in the mount
    data so that we don't have to keep recalculating them.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Brian Foster

    Darrick J. Wong
     

27 Jul, 2018

1 commit

  • The barrier mount options have been no-ops and deprecated since

    4cf4573 xfs: deprecate barrier/nobarrier mount option

    i.e. kernel 4.10 / December 2016, with a stated deprecation schedule
    after v4.15. Should be fair game to remove them now.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Carlos Maiolino
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong

    Eric Sandeen
     

24 Jul, 2018

1 commit