16 Nov, 2012
3 commits
-
To separate the verifiers from iodone functions and associate read
and write verifiers at the same time, introduce a buffer verifier
operations structure to the xfs_buf.

This avoids the need for assigning the write verifier, clearing the
iodone function and re-running ioend processing in the read
verifier, and gets rid of the nasty "b_pre_io" name for the write
verifier function pointer. If we ever need to, it will also be
easier to add further content specific callbacks to a buffer with an
ops structure in place.

We also avoid needing to export verifier functions, instead we
can simply export the ops structures for those that are needed
outside the function they are defined in.

This patch also fixes a directory block readahead verifier issue
it exposed.

This patch also adds ops callbacks to the inode/alloc btree blocks
initialised by growfs. These will need more work before they will
work with CRCs.

Signed-off-by: Dave Chinner
Reviewed-by: Phil White
Signed-off-by: Ben Myers
-
These verifiers are essentially the same code as the read verifiers,
but do not require ioend processing. Hence factor the read verifier
functions and add a new write verifier wrapper that is used as the
callback.

This is done as one large patch for all verifiers rather than one
patch per verifier as the change is largely mechanical. This
includes hooking up the write verifier via the read verifier
function.

Hooking up the write verifier for buffers obtained via
xfs_trans_get_buf() will be done in a separate patch as that touches
code in many different places rather than just the verifier
functions.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
Add a btree block verify callback function and pass it into the
buffer read functions. Because each different btree block type
requires different verification, add a function to the ops structure
that is called from the generic code.

Also, propagate the verification callback functions through the
readahead functions, and into the external bmap and bulkstat inode
readahead code that uses the generic btree buffer read functions.

Signed-off-by: Dave Chinner
Reviewed-by: Phil White
Signed-off-by: Ben Myers
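
The three verifier commits above describe attaching one operations structure to a buffer so that the read and write verifiers travel together and share their checking code. A minimal user-space sketch of that shape follows. Every name in it (`struct buf`, `struct buf_ops`, `toy_*`, `TOY_MAGIC`, `TOY_ECORRUPT`) is a hypothetical stand-in chosen for illustration, not the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel structures. */
struct buf;

struct buf_ops {
	void (*verify_read)(struct buf *bp);   /* run after read I/O completes */
	void (*verify_write)(struct buf *bp);  /* run before write I/O is issued */
};

struct buf {
	const struct buf_ops *ops;  /* attached once, covers both directions */
	int error;                  /* 0 on success, nonzero on failed verification */
	uint32_t magic;             /* toy block contents: just a magic number */
};

#define TOY_MAGIC    0x12345678u  /* illustrative value only */
#define TOY_ECORRUPT 1            /* illustrative positive error code */

/* Shared check, factored out so both wrappers run the same code. */
static bool toy_block_ok(const struct buf *bp)
{
	return bp->magic == TOY_MAGIC;
}

static void toy_read_verify(struct buf *bp)
{
	if (!toy_block_ok(bp))
		bp->error = TOY_ECORRUPT;
}

static void toy_write_verify(struct buf *bp)
{
	if (!toy_block_ok(bp))
		bp->error = TOY_ECORRUPT;
}

/* Only the ops structure has external linkage; the functions stay static. */
const struct buf_ops toy_buf_ops = {
	.verify_read = toy_read_verify,
	.verify_write = toy_write_verify,
};

/* Generic I/O paths call through the ops pointer when one is attached. */
void buf_read_done(struct buf *bp)
{
	if (bp->ops && !bp->error)
		bp->ops->verify_read(bp);
}

void buf_write_prepare(struct buf *bp)
{
	if (bp->ops)
		bp->ops->verify_write(bp);
}
```

In this sketch a caller sets `bp->ops = &toy_buf_ops` once when it obtains the buffer; neither verifier has to install the other or touch completion callbacks.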
18 Oct, 2012
1 commit
-
The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
along with the other inode cache functions. This removes all functionality from
xfs_iget.c, so the file can simply be removed.

This move results in various functions now only having the scope of a single
file (e.g. xfs_inode_free()), so clean up all the definitions and exported
prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
22 Jul, 2012
1 commit
-
All callers of xfs_imap_to_bp want the dinode pointer, so let's calculate it
inside xfs_imap_to_bp. Once that is done xfs_itobp becomes a fairly pointless
wrapper which can be replaced with direct calls to xfs_imap_to_bp.

Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
15 May, 2012
1 commit
-
With the removal of xfs_rw.h and other changes over time, xfs_bit.h
is being included in many files that don't actually need it. Clean
up the includes as necessary.

Also move the only-used-once xfs_ialloc_find_free() static inline
function out of a header file that is widely included to reduce
the number of needless dependencies on xfs_bit.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
27 Mar, 2012
1 commit
-
When we read inodes via bulkstat, we generally only read them once
and then throw them away - they never get used again. If we retain
them in cache, then it simply causes the working set of inodes and
other cached items to be reclaimed just so the inode cache can grow.

Avoid this problem by marking inodes read by bulkstat not to be
cached and check this flag in .drop_inode to determine whether the
inode should be added to the VFS LRU or not. If the inode lookup
hits an already cached inode, then don't set the flag. If the inode
lookup hits an inode marked with no cache flag, remove the flag and
allow it to be cached once the current reference goes away.

Inodes marked as not cached will get cleaned up by the background
inode reclaim or via memory pressure, so they will still generate
some short term cache pressure. They will, however, be reclaimed
much sooner and in preference to cache hot inodes.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers
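
The flag handling described in this commit can be sketched as a small state machine. The names below (`toy_inode`, `LOOKUP_DONTCACHE`, `INODE_DONTCACHE`, `lookup_miss`, `lookup_hit`, `drop_inode`) are hypothetical and only illustrate the three rules in the message; the real lookup and `.drop_inode` code is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define LOOKUP_DONTCACHE 0x1u  /* hypothetical hint passed by a bulkstat lookup */
#define INODE_DONTCACHE  0x1u  /* hypothetical per-inode state flag */

struct toy_inode {
	unsigned int flags;
};

/* Cache miss: a freshly read inode inherits the caller's hint. */
void lookup_miss(struct toy_inode *ip, unsigned int lookup_flags)
{
	ip->flags = 0;
	if (lookup_flags & LOOKUP_DONTCACHE)
		ip->flags |= INODE_DONTCACHE;
}

/*
 * Cache hit: never set the flag, and clear it if it was set, so the
 * inode may be cached once the current reference goes away.
 */
void lookup_hit(struct toy_inode *ip)
{
	ip->flags &= ~INODE_DONTCACHE;
}

/* drop_inode-style hook: true means "do not add to the LRU". */
bool drop_inode(const struct toy_inode *ip)
{
	return (ip->flags & INODE_DONTCACHE) != 0;
}
```

An inode that is only ever touched by bulkstat is dropped on its last reference, while a second lookup of any kind turns it back into an ordinary cacheable inode.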
14 Mar, 2012
1 commit
-
Timestamps on regular files are the last metadata that XFS does not update
transactionally. Now that we use the delaylog mode exclusively and have made
the log code scale extremely well, there is no need to bypass that code for
timestamp updates. Logging all updates allows us to drop a lot of code, and
will allow for further performance improvements later on.

Note that this patch drops optimized handling of fdatasync - it will be
added back in a separate commit.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
08 Apr, 2011
1 commit
-
GCC 4.6 now warns about variables set but not used. Fix the trivially
fixable warnings of this sort.

Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder
19 Oct, 2010
1 commit
-
This patch adds support for 32bit project quota identifiers.

The on-disk format is backward compatible with 16bit projid numbers. The
projid on disk is now kept in two 16bit values - di_projid_lo (which holds
the same position as the old 16bit projid value) and the new di_projid_hi
(which takes existing padding) - and is converted from/to a 32bit value on
the fly.

xfs_admin (for an existing fs) or mkfs.xfs (for a new fs) needs to be used
to enable PROJID32BIT support.

Signed-off-by: Arkadiusz Miśkiewicz
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder
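
The lo/hi split described above amounts to a pair of shift-and-mask helpers. The sketch below keeps the field names from the commit message but uses a hypothetical `toy_dinode` structure and hypothetical helper names, and it ignores the on-disk byte order conversion that a real implementation would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view; on-disk endianness handling is omitted. */
struct toy_dinode {
	uint16_t di_projid_lo;  /* same position as the old 16bit projid */
	uint16_t di_projid_hi;  /* occupies what used to be zeroed padding */
};

/* Combine the two 16bit halves into the 32bit project ID. */
uint32_t toy_get_projid(const struct toy_dinode *dip)
{
	return ((uint32_t)dip->di_projid_hi << 16) | dip->di_projid_lo;
}

/* Split a 32bit project ID into its two 16bit halves. */
void toy_set_projid(struct toy_dinode *dip, uint32_t projid)
{
	dip->di_projid_hi = (uint16_t)(projid >> 16);
	dip->di_projid_lo = (uint16_t)(projid & 0xffff);
}
```

Because the high half lives in padding that older inodes left as zero, reading an old inode through `toy_get_projid()` yields the original 16bit value unchanged, which is the backward compatibility the message refers to.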
27 Jul, 2010
3 commits
-
xfs_iput is just a small wrapper for xfs_iunlock + IRELE. Having this
out of line wrapper means the trace events in those two can't track
their caller properly. So just remove the wrapper and opencode the
unlock + rele in the few callers.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
-
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
-
Dmapi support was never merged upstream, but we still have a lot of hooks
bloating XFS for it, all over the fast paths of the filesystem.

This patch drops over 700 lines of dmapi overhead. If we ever get HSM
support in mainline at least the namespace events can be done much saner
in the VFS instead of the individual filesystem, so it's not like this
is much help for future work.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
24 Jun, 2010
2 commits
-
The block number comes from bulkstat based inode lookups to shortcut
the mapping calculations. We are not able to trust anything from
bulkstat, so drop the block number as well so that the correct
lookups and mappings are always done.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
-
Inode numbers may come from somewhere external to the filesystem
(e.g. file handles, bulkstat information) and so are inherently
untrusted. Rename the flag we use for these lookups to make it
obvious we are doing a lookup of an untrusted inode number and need
to verify it completely before trying to read it from disk.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
23 Jun, 2010
1 commit
-
The non-coherent bulkstat versions that look directly at the inode
buffers cause various problems with performance optimizations that
make increased use of just logging inodes. This patch makes bulkstat
always use iget, which should be fast enough for normal use with the
radix-tree based inode cache introduced a while ago.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
06 Mar, 2010
1 commit
-
So that fsr can attempt to make the fork offset of the temporary
inode it uses the same as that of the inode it is defragmenting, pass
the fork offset out in the bulkstat information.

The bulkstat structure has padding that has always been zeroed, so
userspace can tell if this field is set or not by use of the xattr
present flag and a non-zero value for the fork offset.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder
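
The userspace check described in the second paragraph can be sketched as follows. The structure, field names, and flag value (`toy_bstat`, `xflags`, `forkoff`, `TOY_XFLAG_HASATTR`) are hypothetical placeholders for whatever the real bulkstat structure provides; only the two-part test comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_XFLAG_HASATTR 0x80000000u  /* hypothetical "xattr present" bit */

/* Hypothetical cut-down bulkstat record. */
struct toy_bstat {
	uint32_t xflags;   /* extended flags, including the xattr-present bit */
	uint16_t forkoff;  /* fork offset; zero on kernels that never fill it */
};

/*
 * The field used to be zeroed padding, so a value is only trusted when
 * the inode has an attribute fork and the offset is non-zero.
 */
bool toy_forkoff_is_set(const struct toy_bstat *bs)
{
	return (bs->xflags & TOY_XFLAG_HASATTR) && bs->forkoff != 0;
}
```

A tool in the position of fsr would fall back to its previous behaviour whenever this returns false, since it cannot tell an old kernel from an inode without attributes.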
22 Jan, 2010
1 commit
-
We use the KM_LARGE flag to make kmem_alloc and friends use vmalloc
if necessary. As we only need this for a few boot/mount time
allocations, just switch to explicit vmalloc calls there.

Signed-off-by: Christoph Hellwig
Signed-off-by: Alex Elder
16 Jan, 2010
1 commit
-
The use of an array for the per-ag structures requires reallocation
of the array when growing the filesystem. This requires locking
access to the array to avoid use after free situations, and the
locking is difficult to get right. To avoid needing to reallocate an
array, change the per-ag structures to an allocated object per ag
and index them using a tree structure.

The AGs are always densely indexed (hence the use of an array), but
the number supported is 2^32 and lookups tend to be random and hence
indexing needs to scale. A simple choice is a radix tree - it works
well with this sort of index. This change also removes another
large contiguous allocation from the mount/growfs path in XFS.

The growing process now needs to change to only initialise the new
AGs required for the extra space, and as such only needs to
exclusively lock the tree for inserts. The rest of the code only
needs to lock the tree while doing lookups, and hence this will
remove all the deadlocks that currently occur on the m_perag_lock as
it is now an innermost lock. The lock is also changed to a spinlock
from a read/write lock as the hold time is now extremely short.

To complete the picture, the per-ag structures will need to be
reference counted to ensure that we don't free/modify them while
they are still in use. This will be done in a subsequent patch.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Alex Elder
09 Oct, 2009
1 commit
-
This is picking up on Felix's repost of Dave's patch to implement a
.dirty_inode method. We really need this notification because
the VFS keeps writing directly into the inode structure instead
of going through methods to update this state. In addition to
the long-known atime issue we now also have a caller in VM code
that updates c/mtime that way for shared writeable mmaps. And
I found another one that no one has noticed in practice in the FIFO
code.

So implement ->dirty_inode to set i_update_core whenever the
inode gets externally dirtied, and switch the c/mtime handling to
the same scheme we already use for atime (always picking up
the value from the Linux inode).

Note that this patch also removes the xfs_synchronize_atime call
in xfs_reclaim; it was superfluous as we already synchronize the time
when writing the inode via the log (xfs_inode_item_format) or the
normal buffers (xfs_iflush_int).

In addition also remove the I_CLEAR check before copying the Linux
timestamps - now that we always have the Linux inode available
we can always use the timestamps in it.

Also switch to just using file_update_time for regular reads/writes -
that will get us all optimization done to it for free and make
sure we notice early when it breaks.

Signed-off-by: Christoph Hellwig
Reviewed-by: Felix Blyakher
Reviewed-by: Alex Elder
Signed-off-by: Alex Elder
02 Sep, 2009
2 commits
-
Currently we have a xfs_inobt_lookup* variant for each comparison direction,
and all these get all three fields of the inobt records passed, while the
common case is just looking for the inode number and we have only marginally
more callers than xfs_inobt_lookup* variants.

So opencode a direct call to xfs_btree_lookup for the single case where we
need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_lookup that
just takes the inode number and the direction for all other callers.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher
-
Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and
those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the
three individual variables, too. So just change xfs_inobt_get_rec to write
the output into a xfs_inobt_rec_incore_t directly.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher
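
The first commit in this group collapses one lookup function per comparison direction into a single function that takes the direction as a parameter. The sketch below shows that interface shape over a plain sorted array rather than a btree; `toy_lookup_t`, `toy_inobt_lookup`, and the record layout are hypothetical and stand in for the real cursor-based code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical direction selector replacing three separate functions. */
typedef enum {
	TOY_LOOKUP_EQ,  /* exact match only */
	TOY_LOOKUP_LE,  /* last record <= key */
	TOY_LOOKUP_GE   /* first record >= key */
} toy_lookup_t;

/*
 * Search an ascending array of record start inode numbers.
 * Returns the index of the matching record, or -1 if there is none.
 */
int toy_inobt_lookup(const uint32_t *startino, int nrecs,
		     uint32_t ino, toy_lookup_t dir)
{
	int best = -1;

	for (int i = 0; i < nrecs; i++) {
		if (startino[i] == ino)
			return i;              /* satisfies every direction */
		if (startino[i] > ino) {
			if (dir == TOY_LOOKUP_GE)
				return i;      /* first record above the key */
			break;                 /* sorted: nothing further can match */
		}
		if (dir == TOY_LOOKUP_LE)
			best = i;              /* remember last record below the key */
	}
	return best;
}
```

Callers pass only the key and the direction, which mirrors the commit's point that most callers never needed the other record fields.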
01 Sep, 2009
1 commit
-
A lot more functions could be made static, but they need
forward declarations; this does some easy ones, and also
found a few unused functions in the process.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Felix Blyakher
29 Mar, 2009
1 commit
-
Signed-off-by: Malcolm Parsons
Reviewed-by: Christoph Hellwig
16 Mar, 2009
1 commit
-
Two out of three are unused already, and the third is better done open-coded
with a comment describing what's going on here.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
16 Jan, 2009
1 commit
-
Remove the last of the macros-defined-to-static-functions.
Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
02 Dec, 2008
2 commits
-
The 32-bit xfs_bulkstat_one handler was failing because
a size check checked whether the remaining (32-bit)
user buffer was less than the (64-bit) bulkstat buffer,
and failed with ENOMEM if so. Move this check
into the respective handlers so that they check the
correct sizes.

Also, the formatters were returning negative errors
or positive bytes copied; this was odd in the positive
error value world of xfs, and handled wrong by at least
some of the callers, which treated the bytes returned
as an error value. Move the bytes-used assignment
into the formatters.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
-
Currently the compat formatter was handled by passing
in "private_data" for the xfs_bulkstat_one formatter,
which was really just another formatter... IMHO this
got confusing.

Instead, just make a new xfs_bulkstat_one_compat
formatter for xfs_bulkstat, and call it via a wrapper.

Also, don't translate the ioctl nrs into their native
counterparts, that just clouds the issue; we're in a
compat handler anyway, just switch on the 32-bit cmds.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
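
The formatter convention from the first commit in this group (each formatter checks the buffer against its own record size, returns a positive error, and reports bytes used through an out parameter) can be sketched like this. The record layouts and function names are hypothetical; only the calling convention follows the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical native and compat record layouts of different sizes. */
struct toy_stat   { uint64_t ino; uint64_t size; };  /* 16 bytes */
struct toy_stat32 { uint32_t ino; uint32_t size; };  /*  8 bytes */

/* Returns 0 or a positive errno; bytes written are reported via *ubused. */
int toy_fmt_native(void *ubuf, size_t ubsize, size_t *ubused,
		   const struct toy_stat *src)
{
	if (ubsize < sizeof(*src))      /* check against the native size */
		return ENOMEM;
	memcpy(ubuf, src, sizeof(*src));
	*ubused = sizeof(*src);
	return 0;
}

int toy_fmt_compat(void *ubuf, size_t ubsize, size_t *ubused,
		   const struct toy_stat *src)
{
	struct toy_stat32 out;

	if (ubsize < sizeof(out))       /* check against the compat size */
		return ENOMEM;
	out.ino = (uint32_t)src->ino;   /* toy conversion: plain truncation */
	out.size = (uint32_t)src->size;
	memcpy(ubuf, &out, sizeof(out));
	*ubused = sizeof(out);
	return 0;
}
```

With a buffer that has room for exactly one compat record, the compat formatter succeeds and the native one reports `ENOMEM`, which is the distinction the original single size check got wrong.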
01 Dec, 2008
4 commits
-
Just pass down the XFS_IGET_* flags all the way down to xfs_imap instead
of translating them mid-way.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Niv Sardi
-
Most uses of struct xfs_imap are to map an inode to a buffer. To avoid
copying around the inode location information we should just embed a
struct xfs_imap into the xfs_inode. To make sure it doesn't bloat an
inode the im_len is changed to a ushort, which is fine as that's what
the users expect anyway.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Niv Sardi
-
These names don't add any value at all over just using the numerical
values.

(First sent on October 9th)
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Niv Sardi
-
Now that we have a separate xfs_icdinode_t for the in-core inode which
gets logged there is no need anymore for the xfs_dinode vs xfs_dinode_core
split - the fact that part of the structure gets logged through the inode
log item and a small part not can better be described in a comment.

All sizeof operations on the dinode_core either really wanted the
icdinode and are switched to that one, or had already added the size
of the agi unlinked list pointer. Later both will be replaced with
helpers once we get the larger CRC-enabled dinode.

Removing the data and attribute fork unions also has the advantage that
xfs_dinode.h doesn't need to pull in every header under the sun.

While we're at it also add some more comments describing the dinode
structure.

(First sent on October 7th)
Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Niv Sardi
30 Oct, 2008
4 commits
-
xfs_bulkstat only wants the dinode, offset and buffer from a given inode
number. Instead of using xfs_itobp on a fake inode which is complicated
and currently leads to leaks of the security data just use xfs_inotobp
which is designed to do exactly the kind of lookup xfs_bulkstat wants. The
only thing that's missing in xfs_inotobp is a flags parameter that lets us
pass down XFS_IMAP_BULKSTAT, but that can easily be added.

SGI-PV: 987246
SGI-Modid: xfs-linux-melb:xfs-kern:32397a
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
-
From: Dave Chinner
Because this is the first major generic btree routine this patch includes
some infrastructure: first, a few routines to deal with a btree block that
can be either in short or long form; second, xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor; and
third, the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]
SGI-PV: 985583
SGI-Modid: xfs-linux-melb:xfs-kern:32190a
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner
-
xfs_btree_init_cursor contains very little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583
SGI-Modid: xfs-linux-melb:xfs-kern:32176a
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
Signed-off-by: Bill O'Donnell
Signed-off-by: David Chinner
-
To avoid having to initialise some fields of the XFS inode on every
allocation, we can use the slab init-once feature to initialise them. All
we have to guarantee is that when we free the inode, all its entries are
in the initial state. Add asserts where possible to ensure debug kernels
check this initial state before freeing and after allocation.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31925a
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig
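
The operations vector with a dup_cursor method, introduced by the third commit in this group, can be sketched as below: generic code duplicates a cursor without knowing which btree type it belongs to, because each type supplies its own init routine through the vector. All `toy_*` names and the cursor layout are hypothetical simplifications.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_cur;

/* Hypothetical per-btree-type operations vector. */
struct toy_btree_ops {
	struct toy_cur *(*dup_cursor)(const struct toy_cur *cur);
};

/* Hypothetical cursor: shared header plus one type-specific field. */
struct toy_cur {
	const struct toy_btree_ops *ops;
	unsigned int agno;  /* private state for the toy inode btree */
};

static struct toy_cur *toy_inobt_dup_cursor(const struct toy_cur *cur);

static const struct toy_btree_ops toy_inobt_ops = {
	.dup_cursor = toy_inobt_dup_cursor,
};

/* One init routine per btree type instead of one shared switch. */
struct toy_cur *toy_inobt_init_cursor(unsigned int agno)
{
	struct toy_cur *cur = calloc(1, sizeof(*cur));

	if (!cur)
		return NULL;
	cur->ops = &toy_inobt_ops;
	cur->agno = agno;
	return cur;
}

/* Type-specific duplicate: re-run this type's init with the same state. */
static struct toy_cur *toy_inobt_dup_cursor(const struct toy_cur *cur)
{
	return toy_inobt_init_cursor(cur->agno);
}

/* Generic code only sees the vector, never the concrete type. */
struct toy_cur *toy_btree_dup_cursor(const struct toy_cur *cur)
{
	return cur->ops->dup_cursor(cur);
}
```

Adding another btree type in this sketch means adding another init routine and another ops instance, with no change to `toy_btree_dup_cursor()`.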
13 Aug, 2008
2 commits
-
In various places we can just move a VFS_I call into the argument list of
called functions/macros instead of having a local bhv_vnode_t.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31776a
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
-
Replace XFS_ITOV() with the new VFS_I() inline.
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31724a
Signed-off-by: David Chinner
Signed-off-by: Niv Sardi
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
28 Jul, 2008
1 commit
-
The kmem_free() function takes (ptr, size) arguments but doesn't actually
use the second one.

This patch removes the size argument from all callsites.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31050a
Signed-off-by: Denys Vlasenko
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy
29 Apr, 2008
1 commit
-
Unless XFS_IGET_CREATE is passed xfs_iget will return ENOENT if it
encounters an inode with di_mode == 0. Remove the duplicated checks in the
callers.

(the log recovery case is not touched for now)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30898a
Signed-off-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy
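
The rule this commit relies on, that a lookup rejects a free inode unless the caller asked to create one, can be sketched as a single check. `toy_inode`, `TOY_IGET_CREATE`, and `toy_iget_check` are hypothetical names; the positive `ENOENT` return follows the positive-error convention mentioned elsewhere in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_IGET_CREATE 0x1u  /* hypothetical "caller is allocating" flag */

struct toy_inode {
	uint16_t di_mode;  /* zero means the on-disk inode is free */
};

/*
 * Central check performed by the lookup itself, so callers no longer
 * need their own di_mode == 0 tests afterwards.
 */
int toy_iget_check(const struct toy_inode *ip, unsigned int flags)
{
	if (ip->di_mode == 0 && !(flags & TOY_IGET_CREATE))
		return ENOENT;
	return 0;
}
```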