07 Oct, 2020
1 commit
-
Refactor xfs_getfsmap to improve its performance: instead of indirectly
calling a function that copies one record to userspace at a time, create
a shadow buffer in the kernel and copy the whole array once at the end.
On the author's computer, this reduces the runtime on his /home by ~20%.

This also eliminates a deadlock when running GETFSMAP against the
realtime device. The current code locks the rtbitmap to create
fsmappings and copies them into userspace, having not released the
rtbitmap lock. If the userspace buffer is an mmap of a sparse file that
itself resides on the realtime device, the write page fault will recurse
into the fs for allocation, which will deadlock on the rtbitmap lock.

Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap")
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Chandan Babu R
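The batching idea in the GETFSMAP entry above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct rec`, `copy_out()`, `emit_per_record()` and `emit_batched()` are hypothetical names, and `copy_out()` is a plain `memcpy()` with a call counter standing in for the copy to userspace.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type standing in for one fsmap record. */
struct rec {
	unsigned long long start;
	unsigned long long length;
};

/* Counts how many times the simulated copy to the caller runs. */
static unsigned int copy_calls;

/* Stand-in for the copy to userspace: memcpy plus a call counter. */
static void copy_out(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	copy_calls++;
}

/* Old pattern: one copy per record, issued while records are produced. */
static void emit_per_record(struct rec *user_buf, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		struct rec r = { .start = i * 8, .length = 8 };

		copy_out(&user_buf[i], &r, sizeof(r));
	}
}

/* New pattern: fill a private shadow buffer, then copy once at the end. */
static int emit_batched(struct rec *user_buf, size_t nr)
{
	struct rec *shadow;

	if (nr == 0)
		return 0;
	shadow = calloc(nr, sizeof(*shadow));
	if (!shadow)
		return -1;
	for (size_t i = 0; i < nr; i++) {
		shadow[i].start = i * 8;
		shadow[i].length = 8;
	}
	/* Any lock held while generating records would be dropped before this. */
	copy_out(user_buf, shadow, nr * sizeof(*shadow));
	free(shadow);
	return 0;
}
```

Because the single copy happens after record generation has finished, a page fault on the destination buffer can no longer occur while the generating code still holds its lock, which is the property the deadlock fix relies on.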
16 Sep, 2020
2 commits
-
Replace kmem_zalloc_large() with the global kernel memory API. All of
its callers now use kvzalloc() directly, so kmalloc() falls back to
vmalloc() automatically.

Signed-off-by: Carlos Maiolino
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Redesign the ondisk inode timestamps to be a simple unsigned 64-bit
counter of nanoseconds since 14 Dec 1901 (i.e. the minimum time in the
32-bit unix time epoch). This enables us to handle dates up to 2486,
which solves the y2038 problem.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Gao Xiang
Reviewed-by: Dave Chinner
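The arithmetic behind the timestamp entry above can be checked with a short sketch. An unsigned 64-bit nanosecond counter spans 2^64 ns, roughly 584.5 years, and starting from December 1901 that reaches into 2486. The names below (`BIGTIME_EPOCH_OFFSET`, `encode_bigtime()`, `decode_bigtime()`) are illustrative assumptions, not the identifiers used in the patch, and the sketch ignores any range clamping the real code may perform.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Seconds between the minimum 32-bit signed Unix time and the Unix
 * epoch, i.e. -(INT32_MIN). Illustrative name, assumed for this sketch.
 */
#define BIGTIME_EPOCH_OFFSET 2147483648LL

/* Encode (seconds since the Unix epoch, nanoseconds) as the counter. */
static uint64_t encode_bigtime(int64_t unix_sec, uint32_t nsec)
{
	return (uint64_t)(unix_sec + BIGTIME_EPOCH_OFFSET) * NSEC_PER_SEC + nsec;
}

/* Decode the counter back into Unix seconds and nanoseconds. */
static void decode_bigtime(uint64_t t, int64_t *unix_sec, uint32_t *nsec)
{
	*unix_sec = (int64_t)(t / NSEC_PER_SEC) - BIGTIME_EPOCH_OFFSET;
	*nsec = (uint32_t)(t % NSEC_PER_SEC);
}
```

Under these assumptions the counter value 0 corresponds to the smallest 32-bit Unix time, and the largest counter value decodes to about 16.3 billion seconds after 1970, which falls in the year 2486.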
29 Jul, 2020
1 commit
-
1) FS_DAX_FL has been introduced by commit b383a73f2b83.
2) In the future, the chattr/lsattr commands from e2fsprogs can set/get
inode DAX on XFS by calling ioctl(SETXFLAGS/GETXFLAGS).

Signed-off-by: Xiao Yang
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
12 Jun, 2020
1 commit
-
Pull DAX updates part three from Darrick Wong:
"Now that the xfs changes have landed, this third piece changes the
FS_XFLAG_DAX ioctl code in xfs to request that the inode be reloaded
after the last program closes the file, if doing so would make a S_DAX
change happen. The goal here is to make dax access mode switching
quicker when possible.

Summary:

- Teach XFS to ask the VFS to drop an inode if the administrator
  changes the FS_XFLAG_DAX inode flag such that the S_DAX state would
  change. This can result in files changing access modes without
  requiring an unmount cycle"

* tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fs/xfs: Update xfs_ioctl_setattr_dax_invalidate()
  fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags()
  fs/xfs: Create function xfs_inode_should_enable_dax()
  fs/xfs: Make DAX mount option a tri-state
  fs/xfs: Change XFS_MOUNT_DAX to XFS_MOUNT_DAX_ALWAYS
  fs/xfs: Remove unnecessary initialization of i_rwsem
30 May, 2020
2 commits
-
Because of the separation of FS_XFLAG_DAX from S_DAX and the delayed
setting of S_DAX, data invalidation no longer needs to happen when
FS_XFLAG_DAX is changed.

Change xfs_ioctl_setattr_dax_invalidate() to be
xfs_ioctl_dax_check_set_cache() and alter the code to reflect the new
functionality.

Furthermore, we no longer need the locking so we remove the join_flags
logic.

Signed-off-by: Ira Weiny
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The functionality in xfs_diflags_to_linux() and xfs_diflags_to_iflags() is
nearly identical. The only difference is that *_to_linux() is called after
inode setup and disallows changing the DAX flag.

Combining them can be done with a flag which indicates if this is the initial
setup to allow the DAX flag to be properly set only at init time.

So remove xfs_diflags_to_linux() and call the modified xfs_diflags_to_iflags()
directly.

While we are here simplify xfs_diflags_to_iflags() to take struct xfs_inode and
use xfs_ip2xflags() to ensure future diflags are included correctly.

Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Ira Weiny
Signed-off-by: Darrick J. Wong
27 May, 2020
1 commit
-
Move xfs_fs_eofblocks_from_user into the only file that actually uses
it, so that we don't have this function cluttering up the header file.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Brian Foster
20 May, 2020
1 commit
-
There are three extent counters per inode, one for each of
the forks. Two are in the legacy icdinode and one is directly in
struct xfs_inode. Switch to a single counter in the xfs_ifork structure
where it uses up padding at the end of the structure. This simplifies
various bits of code that just want the number of extents and
can now directly dereference it.

Signed-off-by: Christoph Hellwig
Reviewed-by: Chandan Babu R
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
05 May, 2020
2 commits
-
The functionality in xfs_diflags_to_linux() and xfs_diflags_to_iflags() is
nearly identical. The only difference is that *_to_linux() is called after
inode setup and disallows changing the DAX flag.

Combining them can be done with a flag which indicates if this is the initial
setup to allow the DAX flag to be properly set only at init time.

So remove xfs_diflags_to_linux() and call the modified xfs_diflags_to_iflags()
directly.

While we are here simplify xfs_diflags_to_iflags() to take struct xfs_inode and
use xfs_ip2xflags() to ensure future diflags are included correctly.

Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Ira Weiny
Signed-off-by: Darrick J. Wong
-
The initial value of variable udqp is NULL, and we only set the
flag XFS_QMOPT_PQUOTA in xfs_qm_vop_dqalloc() function, so only
the pdqp value is initialized and the udqp value is still NULL.
Since the udqp value is NULL in the rest of the xfs_ioctl_setattr()
function, it is meaningless and does nothing. So remove it from
xfs_ioctl_setattr().

Signed-off-by: Kaixu Xia
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
13 Apr, 2020
1 commit
-
The filesystem freeze sequence in XFS waits on any background
eofblocks or cowblocks scans to complete before the filesystem is
quiesced. At this point, the freezer has already stopped the
transaction subsystem, however, which means a truncate or cowblock
cancellation in progress is likely blocked in transaction
allocation. This results in a deadlock between freeze and the
associated scanner.

Fix this problem by holding superblock write protection across calls
into the block reapers. Since protection for background scans is
acquired from the workqueue task context, trylock to avoid a similar
deadlock between freeze and blocking on the write lock.

Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
Reported-by: Paul Furtado
Signed-off-by: Brian Foster
Reviewed-by: Chandan Rajendra
Reviewed-by: Christoph Hellwig
Reviewed-by: Allison Collins
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
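The trylock approach described in the freeze fix above can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct freeze_gate`, `gate_trylock()`, `gate_unlock()` and `background_scan()` are hypothetical stand-ins for superblock write protection and the block reapers, built on C11 atomics rather than the kernel's own primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for superblock write protection. */
struct freeze_gate {
	atomic_bool frozen;	/* set once a freeze has started */
	atomic_int writers;	/* tasks currently holding write protection */
};

/* Non-blocking acquire: fail instead of sleeping if a freeze is underway. */
static bool gate_trylock(struct freeze_gate *g)
{
	atomic_fetch_add(&g->writers, 1);
	if (atomic_load(&g->frozen)) {
		atomic_fetch_sub(&g->writers, 1);
		return false;
	}
	return true;
}

/* Release write protection taken by gate_trylock(). */
static void gate_unlock(struct freeze_gate *g)
{
	atomic_fetch_sub(&g->writers, 1);
}

/* Background scan: skip this pass rather than block against the freezer. */
static int background_scan(struct freeze_gate *g, int *blocks_reaped)
{
	if (!gate_trylock(g))
		return -1;	/* a later scheduled pass would retry */
	(*blocks_reaped)++;	/* placeholder for the real reclaim work */
	gate_unlock(g);
	return 0;
}
```

The design point is that a background worker never sleeps waiting for protection. If a freeze is in progress it gives up immediately, so the freezer is not left waiting on a scan that is itself waiting on the freezer.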
19 Mar, 2020
2 commits
-
We know the version is 3 if on a v5 file system. For earlier file
system formats we always upgrade the remaining v1 inodes to v2 and
thus only use v2 inodes. Use the xfs_sb_version_has_large_dinode
helper to check if we deal with small or large dinodes, and thus
remove the need for the di_version field in struct icdinode.

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Only v5 file systems can have the reflink feature, and those will
always use the large dinode format. Remove the extra check for the
inode version.

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
03 Mar, 2020
19 commits
-
Let the low-level attr code only allocate the needed buffer size
for xfs_attrmulti_attr_get instead of allocating the upper bound
at the top of the call chain.

Suggested-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Use the round_down macro, and use the size of the uint32 type we
use in the callback that fills the buffer to make the code a little
more clear - the size of it is always the same as int for platforms
that Linux runs on.

Suggested-by: Dave Chinner
Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The attrlist cursor only exists as part of an attr list context, so
embed the structure instead of pointing to it. Also give it a proper
xfs_ prefix and remove the obsolete typedef.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The ATTR_* flags have a long IRIX history, where they were a userspace
interface, the on-disk format and an internal interface. We've split
out the on-disk interface to the XFS_ATTR_* values, but despite (or
because?) of that the flags have still been a mess. Switch the
internal interface to pass the on-disk XFS_ATTR_* flags for the
namespace and the Linux XATTR_* flags for the actual flags instead.
The ATTR_* values that are actually used are moved to xfs_fs.h with a
new XFS_IOC_* prefix to not conflict with the userspace version that
has the same name and must have the same value.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Move the function to xfs_acl.c and provide a proper stub for the
!CONFIG_XFS_POSIX_ACL case. Lift the flags check to the caller as it
nicely fits in there.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Lift the common code to copy the cursor from and to user space into
xfs_ioc_attr_list. Note that this means we copy in twice now as
the cursor is in the middle of the containing structure, but we never
touch the memory for the original copy. Doing so keeps the cursor
handling isolated in the common helper.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Lift the buffer allocation from the two callers into xfs_ioc_attr_list.
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Lift the flags and bufsize checks from both callers into the common code
in xfs_ioc_attr_list.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The version taking the context structure is the main interface to list
attributes, so drop the _int postfix.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The old xfs_attr_list code is only used by the attrlist by handle
ioctl. Move it to xfs_ioctl.c with its user. Also move the
attrlist and attrlist_ent structure to xfs_fs.h, as they are exposed
user ABIs. They are used through libattr headers with the same name
by at least xfsdump. Also document this relation so that it doesn't
require a research project to figure out.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
op_flags with the XFS_DA_OP_* flags is the usual place for in-kernel
only flags, so move the notime flag there.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Instead of converting from one style of arguments to another in
xfs_attr_set, pass the structure from higher up in the call chain.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Instead of converting from one style of arguments to another in
xfs_attr_set, pass the structure from higher up in the call chain.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Add a new helper to handle a single attr multi ioctl operation that
can be shared between the native and compat ioctl implementation.

There is a slight change in behaviour in that we don't break out of the
loop when copying in the attribute name fails. The previous behaviour
was rather inconsistent here as it continued for any other kind of
error, and that we don't clear the flags in the structure returned
to userspace, a behavior only introduced as a bug fix in the last
merge window.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Simplify the user copy code by using strndup_user. This means that we
now do one memory allocation per operation instead of one per ioctl,
but memory allocations are cheap compared to the actual file system
operations. Also the error for an invalid path is now EINVAL or EFAULT
instead of the previous odd and undocumented ERANGE.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Merge the ioctl handlers just like the low-level xfs_attr_set function.
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
The Linux xattr and acl APIs use a single call for set and remove.
Modify the high-level XFS API to match that and let xfs_attr_set handle
removing attributes as well. With a little bit of reordering this
removes a lot of code.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
While the flags field in the ABI and the on-disk format allows for
multiple namespace flags, an attribute can only exist in a single
namespace at a time. Hence asking to list attributes that exist
in multiple namespaces simultaneously is a logically invalid
request and will return no results. Reject this case early with
-EINVAL.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Chandan Rajendra
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Use the Linux inode i_uid/i_gid members everywhere and just convert
from/to the scalar value when reading or writing the on-disk inode.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
10 Jan, 2020
3 commits
-
This helps to pre-simplify the extra handling of the null terminator in
delayed operations which use memcpy rather than strlen. Later
when we introduce parent pointers, attribute names will become binary,
so strlen will not work at all. Removing uses of strlen now will
help reduce complexities later.

Signed-off-by: Allison Collins
Reviewed-by: Darrick J. Wong
Reviewed-by: Brian Foster
Reviewed-by: Christoph Hellwig
Signed-off-by: Darrick J. Wong
-
While the flags field in the ABI and the on-disk format allows for
multiple namespace flags, that is a logically invalid combination that
scrub complains about. Reject it at the ioctl level, as all other
interfaces already get this right at higher levels.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
Don't allow passing arbitrary flags as they change behavior including
memory allocation that the call stack is not prepared for.

Fixes: ddbca70cc45c ("xfs: allocate xattr buffer on demand")
Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
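The two flag-validation fixes in this group (rejecting more than one namespace flag, and rejecting arbitrary unknown flags) follow a common pattern that can be sketched as below. The `EX_*` constants and `check_attr_flags()` are hypothetical names with made-up values chosen for this sketch. The real ABI constants and the exact set of accepted flags are defined in the XFS headers and may differ.

```c
#include <errno.h>

/* Illustrative flag values only; not the real ABI constants. */
#define EX_ATTR_ROOT	0x0002u
#define EX_ATTR_SECURE	0x0008u
#define EX_ATTR_CREATE	0x0010u
#define EX_ATTR_REPLACE	0x0020u

#define EX_NS_MASK	(EX_ATTR_ROOT | EX_ATTR_SECURE)
#define EX_KNOWN_MASK	(EX_NS_MASK | EX_ATTR_CREATE | EX_ATTR_REPLACE)

/* Return 0 if the flags are acceptable, -EINVAL otherwise. */
static int check_attr_flags(unsigned int flags)
{
	unsigned int ns = flags & EX_NS_MASK;

	/* Reject any bit outside the known set. */
	if (flags & ~EX_KNOWN_MASK)
		return -EINVAL;
	/* ns & (ns - 1) is non-zero only when more than one bit is set. */
	if (ns & (ns - 1))
		return -EINVAL;
	return 0;
}
```

With these assumed values, `check_attr_flags(EX_ATTR_ROOT)` is accepted, while `check_attr_flags(EX_ATTR_ROOT | EX_ATTR_SECURE)` and any call with an unknown bit are rejected before deeper code sees them.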
14 Nov, 2019
2 commits
-
These ioctls set DMAPI specific flags in the on-disk inode, but there is
no way to actually ever query those flags. The only known user is
xfsrestore with the -D option, which is documented to be only useful
inside a DMAPI environment, which isn't supported by upstream XFS.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
-
There is no point in splitting the fields like this in a purely
in-memory structure.

Signed-off-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
08 Nov, 2019
1 commit
-
Some of the xfs source files are missing header includes, so add them
back. Sparse complains about non-static functions that don't have a
forward declaration anywhere.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
01 Nov, 2019
1 commit
-
AIO+DIO can extend the file size on IO completion, and it holds
no inode locks while the IO is in flight. Therefore, a race
condition exists in file size updates if we do something like this:

aio-thread                      fallocate-thread

lock inode
submit IO beyond inode->i_size
unlock inode
.....
                                lock inode
                                break layouts
                                if (off + len > inode->i_size)
                                        new_size = off + len
                                .....
                                inode_dio_wait()
.....
completes
inode->i_size updated
inode_dio_done()
....
                                if (new_size)
                                        xfs_vn_setattr(inode, new_size)

Yup, that attempt to extend the file size in the fallocate code
turns into a truncate - it removes whatever the aio write
allocated and put to disk, and reduces the inode size back down to
where the fallocate operation ends.

Fundamentally, xfs_file_fallocate() is not compatible with racing
AIO+DIO completions, so we need to move the inode_dio_wait() call
up to where we lock the inode and break the layouts.

Secondly, storing the inode size and then using it unchecked without
holding the ILOCK is not safe; we can only do such a thing if we've
locked out and drained all IO and other modification operations,
which we don't do initially in xfs_file_fallocate.

It should be noted that some of the fallocate operations are
compound operations - they are made up of multiple manipulations
that may zero data, and so we may need to flush and invalidate the
file multiple times during an operation. However, we only need to
lock out IO and other space manipulation operations once, as that
lockout is maintained until the entire fallocate operation has been
completed.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong