05 Aug, 2020
1 commit
-
Delete repeated words in fs/xfs/.
{we, that, the, a, to, fork}
Change "it it" to "it is" in one location.Signed-off-by: Randy Dunlap
To: linux-fsdevel@vger.kernel.org
Cc: Darrick J. Wong
Cc: linux-xfs@vger.kernel.org
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
29 Jul, 2020
1 commit
-
xlog_ticket_alloc() is always called under NOFS context, except from
unmount path, which eitherway is holding many FS locks, so, there is no
need for its callers to keep passing allocation flags into it.change xlog_ticket_alloc() to use default kmem_cache_zalloc(), remove
its alloc_flags argument, and always use GFP_NOFS | __GFP_NOFAIL flags.Reviewed-by: Christoph Hellwig
Signed-off-by: Carlos Maiolino
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
23 Jun, 2020
1 commit
-
xlog_wait() on the CIL context can reference a freed context if the
waiter doesn't get scheduled before the CIL context is freed. This
can happen when a task is on the hard throttle and the CIL push
aborts due to a shutdown. This was detected by generic/019:thread 1 thread 2
__xfs_trans_commit
xfs_log_commit_cil
xlog_wait
schedule
xlog_cil_push_work
wake_up_all
xlog_cil_committed
kmem_freeremove_wait_queue
spin_lock_irqsave --> UAFFix it by moving the wait queue to the CIL rather than keeping it in
in the CIL context that gets freed on push completion. Because the
wait queue is now independent of the CIL context and we might have
multiple contexts in flight at once, only wake the waiters on the
push throttle when the context we are pushing is over the hard
throttle size threshold.Fixes: 0e7ab7efe7745 ("xfs: Throttle commits on delayed background CIL push")
Reported-by: Yu Kuai
Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
27 Mar, 2020
5 commits
-
In certain situations the background CIL push can be indefinitely
delayed. While we have workarounds from the obvious cases now, it
doesn't solve the underlying issue. This issue is that there is no
upper limit on the CIL where we will either force or wait for
a background push to start, hence allowing the CIL to grow without
bound until it consumes all log space.To fix this, add a new wait queue to the CIL which allows background
pushes to wait for the CIL context to be switched out. This happens
when the push starts, so it will allow us to block incoming
transaction commit completion until the push has started. This will
only affect processes that are running modifications, and only when
the CIL threshold has been significantly overrun.This has no apparent impact on performance, and doesn't even trigger
until over 45 million inodes had been created in a 16-way fsmark
test on a 2GB log. That was limiting at 64MB of log space used, so
the active CIL size is only about 3% of the total log in that case.
The concurrent removal of those files did not trigger the background
sleep at all.Signed-off-by: Dave Chinner
Reviewed-by: Allison Collins
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
xlog_write_done() is just a thin wrapper around xlog_commit_record(), so
they can be merged together easily.Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Remove xlog_ticket_done and just call the renamed low-level helpers for
ungranting or regranting log space directly. To make that a little
the reference put on the ticket and all tracing is moved into the actual
helpers.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
xfs_log_done() does two separate things. Firstly, it triggers commit
records to be written for permanent transactions, and secondly it
releases or regrants transaction reservation space.Since delayed logging was introduced, transactions no longer write
directly to the log, hence they never have the XLOG_TIC_INITED flag
cleared on them. Hence transactions never write commit records to
the log and only need to modify reservation space.Split up xfs_log_done into two parts, and only call the parts of the
operation needed for the context xfs_log_done() is currently being
called from.Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
The xlog_write() function iterates over iclogs until it completes
writing all the log vectors passed in. The ticket tracks whether
a start record has been written or not, so only the first iclog gets
a start record. We only ever pass single use tickets to
xlog_write() so we only ever need to write a start record once per
xlog_write() call.Hence we don't need to store whether we should write a start record
in the ticket as the callers provide all the information we need to
determine if a start record should be written. For the moment, we
have to ensure that we clear the XLOG_TIC_INITED appropriately so
the code in xfs_log_done() still works correctly for committing
transactions.(darrick: Note the slight behavior change that we always deduct the
size of the op header from the ticket, even for unmount records)Signed-off-by: Dave Chinner
[hch: pass an explicit need_start_rec argument]
Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
23 Mar, 2020
3 commits
-
We can just check for a shut down log all the way down in
xlog_cil_committed instead of passing the parameter. This means a
slight behavior change in that we now also abort log items if the
shutdown came in halfway into the I/O completion processing, which
actually is the right thing to do.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
The only caller of xfs_log_release_iclog doesn't care about the return
value, so remove it. Also don't bother passing the mount pointer,
given that we can trivially derive it from the iclog.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
xlog_cil_push is only called by xlog_cil_push_work, so merge the two
functions.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
11 Nov, 2019
1 commit
-
Just fix the typos checkpatch notices...
Signed-off-by: Joe Perches
Reviewed-by: Bill O'Donnell
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
22 Oct, 2019
1 commit
-
ic_state really is a set of different states, even if the values are
encoded as non-conflicting bits and we sometimes use logical and
operations to check for them. Switch all comparisms to check for
exact values (and use switch statements in a few places to make it
more clear) and turn the values into an implicitly enumerated enum
type.Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Reviewed-by: Brian Foster
27 Aug, 2019
1 commit
-
Since no caller is using KM_NOSLEEP and no callee branches on KM_SLEEP,
we can remove KM_NOSLEEP and replace KM_SLEEP with 0.Signed-off-by: Tetsuo Handa
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
29 Jun, 2019
6 commits
-
There are many, many xfs header files which are included but
unneeded (or included twice) in the xfs code, so remove them.nb: xfs_linux.h includes about 9 headers for everyone, so those
explicit includes get removed by this. I'm not sure what the
preference is, but if we wanted explicit includes everywhere,
a followup patch could remove those xfs_*.h includes from
xfs_linux.h and move them into the files that need them.
Or it could be left as-is.Signed-off-by: Eric Sandeen
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Replace the hand grown linked list handling and cil context attachment
with the standard list_head structure.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
The iop_unlock method is called when comitting or cancelling a
transaction. In the latter case, the transaction may or may not be
aborted. While there is no known problem with the current code in
practice, this implementation is limited in that any log item
implementation that might want to differentiate between a commit and a
cancellation must rely on the aborted state. The aborted bit is only
set when the cancelled transaction is dirty, however. This means that
there is no way to distinguish between a commit and a clean transaction
cancellation.For example, intent log items currently rely on this distinction. The
log item is either transferred to the CIL on commit or released on
transaction cancel. There is currently no possibility for a clean intent
log item in a transaction, but if that state is ever introduced a cancel
of such a transaction will immediately result in memory leaks of the
associated log item(s). This is an interface deficiency and landmine.To clean this up, replace the iop_unlock method with an iop_release
method that is specific to transaction cancel. The existing
iop_committing method occurs at the same time as iop_unlock in the
commit path and there is no need for two separate callbacks here.
Overload the iop_committing method with the current commit time
iop_unlock implementations to eliminate the need for the latter and
further simplify the interface.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
While commiting items looks very similar to freeing them on error it is
a different operation, and they will diverge a bit soon.Split out the commit case from xfs_trans_free_items, inline it into
xfs_log_commit_cil and give it a separate trace point.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Just check if they are present first.
Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Just pass a straight bool aborted instead of abusing XFS_LI_ABORTED as a
flag in function parameters.Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
15 Apr, 2019
1 commit
-
XFS shutdown deadlocks have been reproduced by fstest generic/475.
The deadlock signature involves log I/O completion running error
handling to abort logged items and waiting for an inode cluster
buffer lock in the buffer item unpin handler. The buffer lock is
held by xfsaild attempting to flush an inode. The buffer happens to
be pinned and so xfs_iflush() triggers an async log force to begin
work required to get it unpinned. The log force is blocked waiting
on the commit completion, which never occurs and thus leaves the
filesystem deadlocked.The root problem is that aborted log I/O completion pots commit
completion behind callback completion, which is unexpected for async
log forces. Under normal running conditions, an async log force
returns to the caller once the CIL ctx has been formatted/submitted
and the commit completion event triggered at the tail end of
xlog_cil_push(). If the filesystem has shutdown, however, we rely on
xlog_cil_committed() to trigger the completion event and it happens
to do so after running log item unpin callbacks. This makes it
unsafe to invoke an async log force from contexts that hold locks
that might also be required in log completion processing.To address this problem, wake commit completion waiters before
aborting log items in the log I/O completion handler. This ensures
that an async log force will not deadlock on held locks if the
filesystem happens to shutdown. Note that it is still unsafe to
issue a sync log force while holding such locks because a sync log
force explicitly waits on the force completion, which occurs after
log I/O completion processing.Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
07 Jun, 2018
1 commit
-
Remove the verbose license text from XFS files and replace them
with SPDX tags. This does not change the license of any of the code,
merely refers to the common, up-to-date license files in LICENSES/This change was mostly scripted. fs/xfs/Makefile and
fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
and modified by the following command:for f in `git grep -l "GNU General" fs/xfs/` ; do
echo $f
cat $f | awk -f hdr.awk > $f.new
mv -f $f.new $f
doneAnd the hdr.awk script that did the modification (including
detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
is as follows:$ cat hdr.awk
BEGIN {
hdr = 1.0
tag = "GPL-2.0"
str = ""
}/^ \* This program is free software/ {
hdr = 2.0;
next
}/any later version./ {
tag = "GPL-2.0+"
next
}/^ \*\// {
if (hdr > 0.0) {
print "// SPDX-License-Identifier: " tag
print str
print $0
str=""
hdr = 0.0
next
}
print $0
next
}/^ \* / {
if (hdr > 1.0)
next
if (hdr > 0.0) {
if (str != "")
str = str "\n"
str = str $0
next
}
print $0
next
}/^ \*/ {
if (hdr > 0.0)
next
print $0
next
}// {
if (hdr > 0.0) {
if (str != "")
str = str "\n"
str = str $0
next
}
print $0
}END { }
$Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
10 May, 2018
2 commits
-
It's just a connector between a transaction and a log item. There's
a 1:1 relationship between a log item descriptor and a log item,
and a 1:1 relationship between a log item descriptor and a
transaction. Both relationships are created and terminated at the
same time, so why do we even have the descriptor?Replace it with a specific list_head in the log item and a new
log item dirtied flag to replace the XFS_LID_DIRTY flag.Signed-Off-By: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Christoph Hellwig
[darrick: fix up deferred agfl intent finish_item use of LID_DIRTY]
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Because currently we have no idea what the transaction context we
are operating in is, and I need to know that information to track
down bugs in multiple log item joins to transactions.Signed-Off-By: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
10 Apr, 2018
1 commit
-
Signed-off-by: Eric Sandeen
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
12 Mar, 2018
1 commit
-
When using large directory blocks, we regularly see memory
allocations of >64k being made for the shadow log vector buffer.
When we are under memory pressure, kmalloc() may not be able to find
contiguous memory chunks large enough to satisfy these allocations
easily, and if memory is fragmented we can potentially stall here.TO avoid this problem, switch the log vector buffer allocation to
use kmem_alloc_large(). This will allow failed allocations to fall
back to vmalloc and so remove the dependency on large contiguous
regions of memory being available. This should prevent slowdowns
and potential stalls when memory is low and/or fragmented.Signed-Off-By: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
05 Aug, 2017
1 commit
-
The bio describing discard operation is allocated by
__blkdev_issue_discard() which returns us a reference to it. That
reference is never released and thus we leak this bio. Drop the bio
reference once it completes in xlog_discard_endio().CC: stable@vger.kernel.org
Fixes: 4560e78f40cb55bd2ea8f1ef4001c5baa88531c7
Signed-off-by: Jan Kara
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
19 Jun, 2017
4 commits
-
The t_lsn is not used anymore and the t_commit_lsn is used as a tmp
storage for the checkpoint sequence number only in the current code.And the start/commit lsn are tracked as a transaction group tag in
the xfs_cil_ctx instead of a single transaction, so remove them from
the xfs_trans structure and their users to match with the design.Signed-off-by: Shan Hai
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
If a transaction log reservation overrun occurs, the ticket data
associated with the reservation is dumped in xfs_log_commit_cil().
This occurs long after the transaction items and details have been
removed from the transaction and effectively lost. This limited set
of ticket data provides very little information to support debugging
transaction overruns based on the typical report.To improve transaction log reservation overrun reporting, create a
helper to dump transaction details such as log items, log vector
data, etc., as well as the underlying ticket data for the
transaction. Move the overrun detection from xfs_log_commit_cil() to
xlog_cil_insert_items() so it occurs prior to migration of the
logged items to the CIL. Call the new helper such that it is able to
dump this transaction data before it is lost.Also, warn on overrun to provide callstack context for the offending
transaction and include a few additional messages from
xlog_cil_insert_items() to display the reservation consumed locally
for overhead such as log vector headers, split region headers and
the context ticket. This provides a complete general breakdown of
the reservation consumption of a transaction when/if it happens to
overrun the reservation.Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
Transaction reservation overrun detection currently occurs too late
to print useful information about the offending transaction.
Ideally, the transaction data is printed before the associated log
items are moved from the transaction to the CIL, which occurs in
xlog_cil_insert_items(), such that details of the items logged by
the transaction are available for analysis.Refactor xlog_cil_insert_items() to facilitate moving tx overrun
detection to this function. Update the function to track each bit of
extra log reservation stolen from the transaction (i.e., such as for
the CIL context ticket) and perform the log item migration as the
last operation before the CIL lock is released. This creates a
context where the transaction reservation consumption has been fully
calculated when the log items are moved to the CIL. This patch makes
no functional changes.Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong -
xlog_print_tic_res() pre-dates delayed logging and the committed
items list (CIL) and thus retains some factoring warts, such as hard
coded function names in the output and the fact that it induces a
shutdown.In preparation for more detailed logging of regular transaction
overrun situations, refactor xlog_print_tic_res() to be slightly
more generic. Reword some of the warning messages and pull the
shutdown into the callers.Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
10 Feb, 2017
1 commit
-
Instead we submit the discard requests and use another workqueue to
release the extents from the extent busy list.Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
22 Jul, 2016
1 commit
-
One of the problems we currently have with delayed logging is that
under serious memory pressure we can deadlock memory reclaim. THis
occurs when memory reclaim (such as run by kswapd) is reclaiming XFS
inodes and issues a log force to unpin inodes that are dirty in the
CIL.The CIL is pushed, but this will only occur once it gets the CIL
context lock to ensure that all committing transactions are complete
and no new transactions start being committed to the CIL while the
push switches to a new context.The deadlock occurs when the CIL context lock is held by a
committing process that is doing memory allocation for log vector
buffers, and that allocation is then blocked on memory reclaim
making progress. Memory reclaim, however, is blocked waiting for
a log force to make progress, and so we effectively deadlock at this
point.To solve this problem, we have to move the CIL log vector buffer
allocation outside of the context lock so that memory reclaim can
always make progress when it needs to force the log. The problem
with doing this is that a CIL push can take place while we are
determining if we need to allocate a new log vector buffer for
an item and hence the current log vector may go away without
warning. That means we canot rely on the existing log vector being
present when we finally grab the context lock and so we must have a
replacement buffer ready to go at all times.To ensure this, introduce a "shadow log vector" buffer that is
always guaranteed to be present when we gain the CIL context lock
and format the item. This shadow buffer may or may not be used
during the formatting, but if the log item does not have an existing
log vector buffer or that buffer is too small for the new
modifications, we swap it for the new shadow buffer and format
the modifications into that new log vector buffer.The result of this is that for any object we modify more than once
in a given CIL checkpoint, we double the memory required
to track dirty regions in the log. For single modifications then
we consume the shadow log vectorwe allocate on commit, and that gets
consumed by the checkpoint. However, if we make multiple
modifications, then the second transaction commit will allocate a
shadow log vector and hence we will end up with double the memory
usage as only one of the log vectors is consumed by the CIL
checkpoint. The remaining shadow vector will be freed when th elog
item is freed.This can probably be optimised in future - access to the shadow log
vector is serialised by the object lock (as opposited to the active
log vector, which is controlled by the CIL context lock) and so we
can probably free shadow log vector from some objects when the log
item is marked clean on removal from the AIL.Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Signed-off-by: Dave Chinner
06 Apr, 2016
1 commit
-
These aren't used for CIL-style logging and can be dropped.
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner
29 Jul, 2015
1 commit
-
We have seen somewhat rare reports of the following assert from
xlog_cil_push_background() failing during ltp tests or somewhat
innocuous desktop root fs workloads (e.g., virt operations, initramfs
construction):ASSERT(!list_empty(&cil->xc_cil));
The reasoning behind the assert is that the transaction has inserted
items to the CIL and hit background push codepath all with
cil->xc_ctx_lock held for reading. This locks out background commit from
emptying the CIL, which acquires the lock for writing. Therefore, the
reasoning is that the items previously inserted in the CIL should still
be present.The cil->xc_ctx_lock read lock is not sufficient to protect the xc_cil
list, however, due to how CIL insertion is handled.
xlog_cil_insert_items() inserts and reorders the dirty transaction items
to the tail of the CIL under xc_cil_lock. It uses list_move_tail() to
achieve insertion and reordering in the same block of code. This
function removes and reinserts an item to the tail of the list. If a
transaction commits an item that was already logged and thus already
resides in the CIL, and said item is the sole item on the list, the
removal and reinsertion creates a temporary state where the list is
actually empty.This state is not valid and thus should never be observed by concurrent
transaction commit-side checks in the circumstances outlined above. We
do not want to acquire the xc_cil_lock in all of these instances as it
was previously removed and replaced with a separate push lock for
performance reasons. Therefore, close any races with list_empty() on the
insertion side by ensuring that the list is never in a transient empty
state.Signed-off-by: Brian Foster
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner
04 Jun, 2015
3 commits
-
Instead of the confusing flags argument pass a boolean flag to indicate if
we want to release or regrant a log reservation.Also ensure that xfs_log_done always drop the reference on the log ticket,
to both simplify the code and make the logic in xfs_trans_roll easier
to understand.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner -
The flags argument to xfs_trans_commit is not useful for most callers, as
a commit of a transaction without a permanent log reservation must pass
0 here, and all callers for a transaction with a permanent log reservation
except for xfs_trans_roll must pass XFS_TRANS_RELEASE_LOG_RES. So remove
the flags argument from the public xfs_trans_commit interfaces, and
introduce low-level __xfs_trans_commit variant just for xfs_trans_roll
that regrants a log reservation instead of releasing it.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner -
The flags value always was 0 or XFS_TRANS_ABORT. Switch to a bool
parameter to allow further cleanups.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner
28 Nov, 2014
2 commits
-
More on-disk format consolidation.
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner -
More on-disk format consolidation. A few declarations that weren't on-disk
format related move into better suitable spots.Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Dave Chinner