Eric Lee / smarc-fsl-linux-kernel

16 Sep, 2020

1 commit

664ffb8a4 xfs: move the buffer retry logic to xfs_buf.c

Move the buffer retry state machine logic to xfs_buf.c and call it once
from xfs_ioend instead of duplicating it three times for the three kinds
of buffers.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Christoph Hellwig
2020-09-16 11:52:38 +0800

07 Sep, 2020

1 commit

718ecc503 xfs: xfs_iflock is no longer a completion

With the recent rework of the inode cluster flushing, we no longer
ever wait on the inode flush "lock". It was never a lock in the
first place, just a completion to allow callers to wait for inode IO
to complete. We now never wait for flush completion as all inode
flushing is non-blocking. Hence we can get rid of all the iflock
infrastructure and instead just set and check a state flag.

Rename the XFS_IFLOCK flag to XFS_IFLUSHING, convert all the
xfs_iflock_nowait() calls to test-and-set operations on that flag,
and replace all the xfs_ifunlock() calls with clear operations.
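
As an illustration of the flag-based scheme described above, here is a
minimal user-space sketch. It uses C11 atomics rather than the kernel's
own flag helpers, and every name in it (demo_inode, DEMO_IFLUSHING,
demo_flush_begin and so on) is hypothetical, chosen only for this
example. It shows the general test-and-set and clear pattern, not the
actual XFS implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit; the real XFS_IFLUSHING value is not shown here. */
#define DEMO_IFLUSHING (1u << 0)

/* Stand-in for an inode that only carries a flags word. */
struct demo_inode {
    atomic_uint flags;
};

/* Test-and-set: returns true only for the caller that set the flag. */
static bool demo_flush_begin(struct demo_inode *ip)
{
    unsigned int old = atomic_fetch_or(&ip->flags, DEMO_IFLUSHING);

    return (old & DEMO_IFLUSHING) == 0;
}

/* Plain state check; nobody ever sleeps waiting for the flag to clear. */
static bool demo_is_flushing(struct demo_inode *ip)
{
    return (atomic_load(&ip->flags) & DEMO_IFLUSHING) != 0;
}

/* Clear operation, the counterpart of the old "unlock" call. */
static void demo_flush_done(struct demo_inode *ip)
{
    assert(demo_is_flushing(ip));
    atomic_fetch_and(&ip->flags, ~DEMO_IFLUSHING);
}
```

A caller that loses the test-and-set simply skips the inode instead of
blocking, which is why no completion or wait queue is needed.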

Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Dave Chinner
2020-09-07 09:05:51 +0800

07 Jul, 2020

3 commits

aac855ab1 xfs: make inode IO completion buffer centric

Having different io completion callbacks for different inode states
makes things complex. We can detect if the inode is stale via the
XFS_ISTALE flag in IO completion, so we don't need a special
callback just for this.

This means inodes only have a single iodone callback, and inode IO
completion is entirely buffer centric at this point. Hence we no
longer need to use a log item callback at all as we can just call
xfs_iflush_done() directly from the buffer completions and walk the
buffer log item list to complete all the inodes under IO.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Reviewed-by: Brian Foster
Signed-off-by: Darrick J. Wong

Dave Chinner
2020-07-07 01:46:59 +0800

1319ebefd xfs: add an inode item lock

The inode log item is kind of special in that it can be aggregating
new changes in memory at the same time as existing changes are
being written back to disk. This means there are fields in the log
item that are accessed concurrently from contexts that don't share
any locking at all.

e.g. ili_last_fields is updated under the ILOCK_EXCL and the flush
lock at flush time, updated under only the flush lock at IO
completion time, and read under the ILOCK_EXCL when the inode is
logged. Hence there is no actual serialisation between reading the
field during logging of the inode in transactions vs clearing the
field in IO completion.

We currently get away with this by the fact that we are only
clearing fields in IO completion, and nothing bad happens if we
accidentally log more of the inode than we actually modify. Worst
case is we consume a tiny bit more memory and log bandwidth.

However, if we want to do more complex state manipulations on the
log item that require updates at all three of these potential
locations, we need to have some mechanism of serialising those
operations. To do this, introduce a spinlock into the log item to
serialise internal state.

This could be done via the xfs_inode i_flags_lock, but this then
leads to potential lock inversion issues where inode flag updates
need to occur inside locks that best nest inside the inode log item
locks (e.g. marking inodes stale during inode cluster freeing).
Using a separate spinlock avoids these sorts of problems and
simplifies future code.
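
The serialisation described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the
structure, the field names fields and last_fields, and all demo_*
functions are hypothetical, and a C11 atomic_flag stands in for the
kernel spinlock. The three entry points correspond to the three
contexts the text mentions: logging in a transaction, flushing, and IO
completion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical log item: one small lock guarding both field masks. */
struct demo_log_item {
    atomic_flag lock;      /* stand-in for a spinlock */
    uint32_t fields;       /* changes accumulated in memory */
    uint32_t last_fields;  /* changes currently being written back */
};

static void demo_init(struct demo_log_item *li)
{
    atomic_flag_clear(&li->lock);
    li->fields = 0;
    li->last_fields = 0;
}

static void demo_lock(struct demo_log_item *li)
{
    while (atomic_flag_test_and_set_explicit(&li->lock, memory_order_acquire))
        ; /* spin until the holder releases the lock */
}

static void demo_unlock(struct demo_log_item *li)
{
    atomic_flag_clear_explicit(&li->lock, memory_order_release);
}

/* Transaction context: record newly dirtied fields. */
static void demo_log(struct demo_log_item *li, uint32_t mask)
{
    demo_lock(li);
    li->fields |= mask;
    demo_unlock(li);
}

/* Flush context: move accumulated changes to the in-flight mask. */
static uint32_t demo_flush(struct demo_log_item *li)
{
    uint32_t flushing;

    demo_lock(li);
    assert(li->last_fields == 0); /* this sketch allows one flush in flight */
    flushing = li->fields;
    li->last_fields = flushing;
    li->fields = 0;
    demo_unlock(li);
    return flushing;
}

/* IO completion context: the in-flight changes are now on disk. */
static void demo_io_done(struct demo_log_item *li)
{
    demo_lock(li);
    li->last_fields = 0;
    demo_unlock(li);
}

/* What a relog must cover: new changes plus anything still in flight. */
static uint32_t demo_relog_mask(struct demo_log_item *li)
{
    uint32_t mask;

    demo_lock(li);
    mask = li->fields | li->last_fields;
    demo_unlock(li);
    return mask;
}
```

Because the lock is private to the log item, it never has to nest with
an inode-wide flags lock, which is the inversion risk the paragraph
above is concerned with.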

This does not touch the use of ili_fields in the item formatting
code - that is entirely protected by the ILOCK_EXCL at this point in
time, so it remains untouched.

Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Dave Chinner
2020-07-07 01:46:58 +0800

1dfde687a xfs: remove logged flag from inode log item

This was used to track if the item had logged fields being flushed
to disk. We log everything in the inode these days, so this logic is
no longer needed. Remove it.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Reviewed-by: Brian Foster
Signed-off-by: Darrick J. Wong

Dave Chinner
2020-07-07 01:46:58 +0800

07 May, 2020

1 commit

88fc18798 xfs: remove unused iflush stale parameter

The stale parameter was used to control the now unused shutdown
parameter of xfs_trans_ail_remove().

Signed-off-by: Brian Foster
Reviewed-by: Dave Chinner
Reviewed-by: Allison Collins
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Brian Foster
2020-05-07 23:27:48 +0800

05 May, 2020

1 commit

fd9cbe512 xfs: remove the xfs_inode_log_item_t typedef

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Christoph Hellwig
2020-05-05 00:03:16 +0800

29 Jun, 2019

1 commit

efe2330fd xfs: remove the xfs_log_item_t typedef

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Christoph Hellwig
2019-06-29 10:27:33 +0800

07 Jun, 2018

1 commit

0b61f8a40 xfs: convert to SPDX license tags

Remove the verbose license text from XFS files and replace them
with SPDX tags. This does not change the license of any of the code,
merely refers to the common, up-to-date license files in LICENSES/

This change was mostly scripted. fs/xfs/Makefile and
fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
and modified by the following command:

    for f in `git grep -l "GNU General" fs/xfs/` ; do
        echo $f
        cat $f | awk -f hdr.awk > $f.new
        mv -f $f.new $f
    done

And the hdr.awk script that did the modification (including
detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
is as follows:

    $ cat hdr.awk
    BEGIN {
        hdr = 1.0
        tag = "GPL-2.0"
        str = ""
    }

    /^ \* This program is free software/ {
        hdr = 2.0;
        next
    }

    /any later version./ {
        tag = "GPL-2.0+"
        next
    }

    /^ \*\// {
        if (hdr > 0.0) {
            print "// SPDX-License-Identifier: " tag
            print str
            print $0
            str=""
            hdr = 0.0
            next
        }
        print $0
        next
    }

    /^ \* / {
        if (hdr > 1.0)
            next
        if (hdr > 0.0) {
            if (str != "")
                str = str "\n"
            str = str $0
            next
        }
        print $0
        next
    }

    /^ \*/ {
        if (hdr > 0.0)
            next
        print $0
        next
    }

    // {
        if (hdr > 0.0) {
            if (str != "")
                str = str "\n"
            str = str $0
            next
        }
        print $0
    }

    END { }
    $

Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Dave Chinner
2018-06-07 05:17:53 +0800

02 Nov, 2017

1 commit

06b113212 xfs: remove inode log format typedef

Remove xfs_inode_log_format_t now that xfs_inode_log_format is
explicitly padded and therefore is a real on-disk structure. This
enables xfs/122 to check the size of the structure.

Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner

Darrick J. Wong
2017-11-02 06:03:16 +0800

03 Nov, 2015

1 commit

fc0561cef xfs: optimise away log forces on timestamp updates for fdatasync

xfs: timestamp updates cause excessive fdatasync log traffic

Sage Weil reported that a ceph test workload was writing to the
log on every fdatasync during an overwrite workload. Event tracing
showed that the only metadata modification being made was the
timestamp updates during the write(2) syscall, but fdatasync(2)
is supposed to ignore them. The key observation was that the
transactions in the log all looked like this:

INODE: #regs: 4 ino: 0x8b flags: 0x45 dsize: 32

And contained a flags field of 0x45 or 0x85, and had data and
attribute forks following the inode core. This means that the
timestamp updates were triggering dirty relogging of previously
logged parts of the inode that hadn't yet been flushed back to
disk.

There are two parts to this problem. The first is that XFS relogs
dirty regions in subsequent transactions, so it carries around the
fields that have been dirtied since the last time the inode was
written back to disk, not since the last time the inode was forced
into the log.

The second part is that on v5 filesystems, the inode change count
update during inode dirtying also sets the XFS_ILOG_CORE flag, so
on v5 filesystems this makes a timestamp update dirty the entire
inode.

As a result when fdatasync is run, it looks at the dirty fields in
the inode, and sees more than just the timestamp flag, even though
the only metadata change since the last fdatasync was just the
timestamps. Hence we force the log on every subsequent fdatasync
even though it is not needed.

To fix this, add a new field to the inode log item that tracks
changes since the last time fsync/fdatasync forced the log to flush
the changes to the journal. This field is updated when we dirty the
inode, but we do it before updating the change count so it does not
carry the "core dirty" flag from timestamp updates. The fields are
zeroed when the inode is marked clean (due to writeback/freeing) or
when an fsync/fdatasync forces the log. Hence if we only dirty the
timestamps on the inode between fsync/fdatasync calls, the fdatasync
will not trigger another log force.
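
The tracking scheme described above can be sketched as a small
user-space model. All names here are hypothetical (DEMO_ILOG_* bits,
the fields and fsync_fields members, the demo_* functions); the real
flag values and function names are not given in this text. The sketch
only shows the idea: record what was dirtied before the change-count
update adds the "core dirty" bit, and let fdatasync ignore
timestamp-only changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty-field bits for this example only. */
#define DEMO_ILOG_CORE      (1u << 0)
#define DEMO_ILOG_DDATA     (1u << 1)
#define DEMO_ILOG_TIMESTAMP (1u << 2)

struct demo_log_item {
    uint32_t fields;        /* dirty since the last writeback to disk */
    uint32_t fsync_fields;  /* dirty since the last fsync/fdatasync log force */
};

/* Dirty the inode; on v5 filesystems the change count bump dirties the core. */
static void demo_log_inode(struct demo_log_item *li, uint32_t flags, bool v5)
{
    assert(flags != 0);
    li->fsync_fields |= flags;   /* recorded before the change-count update */
    if (v5)
        flags |= DEMO_ILOG_CORE; /* only the writeback mask sees this bit */
    li->fields |= flags;
}

/* Inode written back or freed: nothing is dirty any more. */
static void demo_mark_clean(struct demo_log_item *li)
{
    li->fields = 0;
    li->fsync_fields = 0;
}

/* Returns true if fdatasync had to force the log. */
static bool demo_fdatasync(struct demo_log_item *li)
{
    bool force = (li->fsync_fields & ~DEMO_ILOG_TIMESTAMP) != 0;

    if (force)
        li->fsync_fields = 0;    /* changes are now in the journal */
    return force;
}
```

With this model, a timestamp-only update between two fdatasync calls
leaves fsync_fields with just the timestamp bit set, so the second call
returns without forcing the log even though fields still carries the
core bit.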

Over 100 runs of the test program:

Ext4 baseline:
runtime: 1.63s +/- 0.24s
avg lat: 1.59ms +/- 0.24ms
iops: ~2000

XFS, vanilla kernel:
runtime: 2.45s +/- 0.18s
avg lat: 2.39ms +/- 0.18ms
log forces: ~400/s
iops: ~1000

XFS, patched kernel:
runtime: 1.49s +/- 0.26s
avg lat: 1.46ms +/- 0.25ms
log forces: ~30/s
iops: ~1500

Reported-by: Sage Weil
Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Signed-off-by: Dave Chinner

Dave Chinner
2015-11-03 10:14:59 +0800

13 Dec, 2013

2 commits

2f251293b xfs: remove the inode log format from the inode log item

No need to keep the inode log format around all the time, we can
easily generate it at iop_format time.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner

Christoph Hellwig
2013-12-13 08:34:05 +0800

da7765031 xfs: format logged extents directly into the CIL

With the new iop_format scheme there is no need to have a temporary buffer
to format logged extents into; we can do so directly into the CIL. This
also allows us to remove the shortcut for big endian systems that probably
hasn't gotten a lot of test coverage for a long time.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner

Christoph Hellwig
2013-12-13 08:34:04 +0800

13 Aug, 2013

1 commit

69432832f xfs: split out inode log item format definition

The log item format definitions are shared with userspace. Split
them out of header files that contain kernel only definitions to make
it simple to share them.

Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2013-08-13 05:05:19 +0800

18 Dec, 2012

1 commit

ec47eb6b0 xfs: remove the XFS_TRANS_DEBUG routines

Remove the XFS_TRANS_DEBUG routines. They are no longer appropriate
and have not been used in years.

Signed-off-by: Mark Tinguely
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers

Mark Tinguely
2012-12-18 06:29:00 +0800

15 May, 2012

1 commit

04913fdd9 xfs: pass shutdown method into xfs_trans_ail_delete_bulk

xfs_trans_ail_delete_bulk() can be called from different contexts so
if the item is not in the AIL we need a different shutdown method for each
context. Pass in the shutdown method needed so the correct action
can be taken.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Dave Chinner
2012-05-15 05:20:33 +0800

14 Mar, 2012

3 commits

8f639ddea xfs: reimplement fdatasync support

Add an in-memory only flag to say we logged timestamps only, and use it to
check if fdatasync can optimize away the log force.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-03-14 06:18:14 +0800

f5d8d5c4b xfs: split in-core and on-disk inode log item fields

Add a new ili_fields member to the inode log item to isolate the in-memory
flags from the ones that actually go to the log. This will allow tracking
timestamp-only updates for fdatasync and O_DSYNC in the next patch and
prepares for divorcing the on-disk log format from the in-memory log item
a little further down the road.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-03-14 06:08:17 +0800

8a9c9980f xfs: log timestamp updates

Timestamps on regular files are the last metadata that XFS does not update
transactionally. Now that we use the delaylog mode exclusively and have made
the log code scale extremely well, there is no need to bypass that code for
timestamp updates. Logging all updates allows us to drop a lot of code, and
will allow for further performance improvements later on.

Note that this patch drops optimized handling of fdatasync - it will be
added back in a separate commit.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers

Christoph Hellwig
2012-03-14 06:01:15 +0800

27 Jul, 2010

2 commits

898621d5a xfs: simplify inode to transaction joining

Currently we need to either call IHOLD or xfs_trans_ihold on an inode when
joining it to a transaction via xfs_trans_ijoin.

This patch instead makes xfs_trans_ijoin usable on its own by doing
an implicit xfs_trans_ihold, which also allows us to drop the third
argument. For the case where we want to hold a reference on the inode,
an xfs_trans_ijoin_ref wrapper is added which does the IHOLD and marks
the inode as needing an xfs_iput. In addition to giving the caller a
cleaner interface, this also simplifies the implementation.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner

Christoph Hellwig
2010-07-27 02:16:36 +0800

ca30b2a7b xfs: give li_cb callbacks the correct prototype

Stop the function pointer casting madness and give all the li_cb instances
the correct prototype.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner

Christoph Hellwig
2010-07-27 02:16:35 +0800

02 Feb, 2010

1 commit

d808f617a xfs: Don't issue buffer IO direct from AIL push V2

All buffers logged into the AIL are marked as delayed write.
When the AIL needs to push the buffer out, it issues an async write of the
buffer. This means that IO patterns are dependent on the order of
buffers in the AIL.

Instead of flushing the buffer, promote the buffer in the delayed
write list so that the next time the xfsbufd is run the buffer will
be flushed by the xfsbufd. Return the state to the xfsaild that the
buffer was promoted so that the xfsaild knows that it needs to cause
the xfsbufd to run to flush the buffers that were promoted.

Using the xfsbufd for issuing the IO allows us to dispatch all
buffer IO from the one queue. This means that we can make much more
enlightened decisions on what order to flush buffers to disk as
we don't have multiple places issuing IO. Optimisations to xfsbufd
will be in a future patch.

Version 2
- kill XFS_ITEM_FLUSHING as it is now unused.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig

Dave Chinner
2010-02-02 07:13:42 +0800

17 Dec, 2009

1 commit

a5f9be58c xfs: kill xfs_bmbt_rec_32/64 types

For a long time we've always stored bmap btree records in the 64bit format,
so kill off the dead 32bit type, and make sure the 64bit type is named just
xfs_bmbt_rec everywhere, without any size postfix.

Signed-off-by: Christoph Hellwig
Reviewed-by: Eric Sandeen
Signed-off-by: Alex Elder

Christoph Hellwig
2009-12-17 03:41:20 +0800

02 Sep, 2009

1 commit

aa72a5cf0 xfs: simplify xfs_trans_iget

xfs_trans_iget is a wrapper for xfs_iget that adds the inode to the
transaction after it is read. Except when the inode already is in the
inode cache, in which case it returns the existing locked inode with
incremented lock recursion counts.

Now, no one in the tree ever decrements these lock recursion counts,
so any user of this gets a potential double unlock when both the original
owner of the inode and the xfs_trans_iget caller unlock it. When looking
back in a git bisect in the historic XFS tree there was only one place
that decremented these counts, xfs_trans_iput. Introduced in commit
ca25df7a840f426eb566d52667b6950b92bb84b5 by Adam Sweeney in 1993,
and removed in commit 19f899a3ab155ff6a49c0c79b06f2f61059afaf3 by
Steve Lord in 2003. And unless it slipped through the cracks of git
bisect, it was never actually used in that time frame.

A quick audit of the callers of xfs_trans_iget shows that, fortunately,
no caller really relies on this behaviour - xfs_ialloc allocates this
inode from disk so it must not be there before, and all the RT allocator
routines only ever add each RT bitmap inode once.

In addition to removing lots of code and reducing the size of the inode
item this patch also avoids the double inode cache lookup in each
create/mkdir/mknod transaction.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher

Christoph Hellwig
2009-09-02 01:46:16 +0800

22 Jan, 2009

1 commit

5253a11a8 [XFS] remove always-true #ifndef HAVE_FORMAT32 tests

There are several tests for #ifndef HAVE_FORMAT32, but
this is never defined anywhere so it is always the default
behavior; just remove the ifndef goop.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

Eric Sandeen
2009-01-22 11:07:31 +0800

16 Jan, 2009

1 commit

9d87c3192 [XFS] Remove the rest of the macro-to-function indirections.

Remove the last of the macros-defined-to-static-functions.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

Eric Sandeen
2009-01-16 14:10:42 +0800

30 Oct, 2008

1 commit

847fff5ca [XFS] Sync up kernel and user-space headers

SGI-PV: 986558

SGI-Modid: xfs-linux-melb:xfs-kern:32231a

Signed-off-by: Barry Naujok
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

Barry Naujok
2008-10-30 14:05:38 +0800

18 Apr, 2008

1 commit

335404089 [XFS] Use xfs_inode_clean() in more places

Remove open coded checks for whether the inode is clean and replace
them with an inlined function.

SGI-PV: 977461
SGI-Modid: xfs-linux-melb:xfs-kern:30503a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

David Chinner
2008-04-18 09:37:51 +0800

28 Sep, 2006

1 commit

128dabc5e [XFS] cleanup the field types of some item format structures

SGI-PV: 954365
SGI-Modid: xfs-linux-melb:xfs-kern:26406a

Signed-off-by: Tim Shimmin

Tim Shimmin
2006-09-28 08:55:43 +0800

09 Jun, 2006

1 commit

6d192a9b8 [XFS] inode items and EFI/EFDs have different ondisk format for 32bit and
64bit kernels allow recovery to handle both versions and do the necessary
decoding

SGI-PV: 952214
SGI-Modid: xfs-linux-melb:xfs-kern:26011a

Signed-off-by: Tim Shimmin
Signed-off-by: Nathan Scott

Tim Shimmin
2006-06-09 12:55:38 +0800

02 Nov, 2005

2 commits

7b7187698 [XFS] Update license/copyright notices to match the preferred SGI
boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 11:58:39 +0800

a844f4510 [XFS] Remove xfs_macros.c, xfs_macros.h, rework headers a whole lot.

SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 11:38:42 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800