11 Sep, 2013
6 commits
-
This patch adds the missing call to list_lru_destroy (spotted by Li Zhong)
and moves the deletion to after the shrinker is unregistered, as correctly
spotted by Dave.

Signed-off-by: Glauber Costa
Cc: Michal Hocko
Cc: Dave Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
We currently use a compile-time constant to size the node array for the
list_lru structure. Due to this, we don't need to allocate any memory at
initialization time. But as a consequence, the structures that contain
embedded list_lru lists can become way too big (the superblock, for
instance, contains two of them).

This patch aims at ameliorating this situation by dynamically allocating
the node arrays with the firmware-provided nr_node_ids.

Signed-off-by: Glauber Costa
Cc: Dave Chinner
Cc: Mel Gorman
Cc: "Theodore Ts'o"
Cc: Adrian Hunter
Cc: Al Viro
Cc: Artem Bityutskiy
Cc: Arve Hjønnevåg
Cc: Carlos Maiolino
Cc: Christoph Hellwig
Cc: Chuck Lever
Cc: Daniel Vetter
Cc: David Rientjes
Cc: Gleb Natapov
Cc: Greg Thelen
Cc: J. Bruce Fields
Cc: Jan Kara
Cc: Jerome Glisse
Cc: John Stultz
Cc: KAMEZAWA Hiroyuki
Cc: Kent Overstreet
Cc: Kirill A. Shutemov
Cc: Marcelo Tosatti
Cc: Mel Gorman
Cc: Steven Whitehouse
Cc: Thomas Hellstrom
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
The new LRU list isolation code in xfs_qm_dquot_isolate() isn't
completely up to date. Firstly, it needs conversion to return enum
lru_status values, not raw numbers. Secondly - most importantly - it
fails to unlock the dquot and relock the LRU in the LRU_RETRY path.
This leads to deadlocks in xfstests generic/232. Fix them.

Signed-off-by: Dave Chinner
Cc: Glauber Costa
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
fix warnings
Cc: Dave Chinner
Cc: Glauber Costa
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
Convert the XFS dquot lru to use the list_lru construct and convert the
shrinker to being node aware.

[glommer@openvz.org: edited for conflicts + warning fixes]
Signed-off-by: Dave Chinner
Signed-off-by: Glauber Costa
Cc: "Theodore Ts'o"
Cc: Adrian Hunter
Cc: Al Viro
Cc: Artem Bityutskiy
Cc: Arve Hjønnevåg
Cc: Carlos Maiolino
Cc: Christoph Hellwig
Cc: Chuck Lever
Cc: Daniel Vetter
Cc: David Rientjes
Cc: Gleb Natapov
Cc: Greg Thelen
Cc: J. Bruce Fields
Cc: Jan Kara
Cc: Jerome Glisse
Cc: John Stultz
Cc: KAMEZAWA Hiroyuki
Cc: Kent Overstreet
Cc: Kirill A. Shutemov
Cc: Marcelo Tosatti
Cc: Mel Gorman
Cc: Steven Whitehouse
Cc: Thomas Hellstrom
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
-
The sysctl knob sysctl_vfs_cache_pressure is used to determine which
percentage of the shrinkable objects in our cache we should actively try
to shrink.

It works great in situations in which we have many objects (at least more
than 100), because the approximation errors will be negligible. But if
this is not the case, especially when total_objects < 100, we may end up
concluding that we have no objects at all (total / 100 = 0, if total <
100).

This is certainly not the biggest killer in the world, but may matter in
very low kernel memory situations.

Signed-off-by: Glauber Costa
Reviewed-by: Carlos Maiolino
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Mel Gorman
Cc: Dave Chinner
Cc: Al Viro
Cc: "Theodore Ts'o"
Cc: Adrian Hunter
Cc: Al Viro
Cc: Artem Bityutskiy
Cc: Arve Hjønnevåg
Cc: Carlos Maiolino
Cc: Christoph Hellwig
Cc: Chuck Lever
Cc: Daniel Vetter
Cc: David Rientjes
Cc: Gleb Natapov
Cc: Greg Thelen
Cc: J. Bruce Fields
Cc: Jan Kara
Cc: Jerome Glisse
Cc: John Stultz
Cc: KAMEZAWA Hiroyuki
Cc: Kent Overstreet
Cc: Kirill A. Shutemov
Cc: Marcelo Tosatti
Cc: Mel Gorman
Cc: Steven Whitehouse
Cc: Thomas Hellstrom
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro
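
The rounding problem described in the commit above can be reproduced in a few lines of userspace C. This is only a sketch: the function names scale_naive() and scale_frac() are hypothetical and are not taken from the patch. The second variant scales the quotient and the remainder separately, which is similar in spirit to the kernel's mult_frac() helper.

```c
/*
 * Naive scaling: divides first, so any total below 100 collapses to 0
 * before the pressure value is applied.
 */
unsigned long scale_naive(unsigned long total, unsigned long pressure)
{
	return (total / 100) * pressure;
}

/*
 * Scale the quotient and the remainder separately so that small totals
 * are not rounded away. Illustrative only; not the patch itself.
 */
unsigned long scale_frac(unsigned long total, unsigned long pressure)
{
	unsigned long quot = total / 100;
	unsigned long rem = total % 100;

	return quot * pressure + (rem * pressure) / 100;
}
```

With 50 objects and a pressure of 100, scale_naive() reports no objects at all, while scale_frac() reports all 50.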
16 Aug, 2013
1 commit
-
Use uint32 from init_user_ns for xfs internal uid/gid
representation in xfs_icdinode, xfs_dqid_t.

Reviewed-by: Dave Chinner
Reviewed-by: Gao feng
Signed-off-by: Dwight Engen
Signed-off-by: Ben Myers
13 Aug, 2013
3 commits
-
Now that the new xfs_trans_res structure has been introduced, the log
reservation size, log count and log flags are pre-initialized
at mount time. So it's time to refine the xfs_trans_reserve() interface
to be neater.

Also, introduce a new helper M_RES() to return a pointer to the
mp->m_resv structure to simplify the input.

Signed-off-by: Jie Liu
Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
There are a few small helper functions in xfs_util, all related to
xfs_inode modifications. Move them all to xfs_inode.c so all
xfs_inode operations are consolidated in one place.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
The on disk format definitions of the on-disk dquot, log formats and
quota off log formats are all intertwined with other definitions for
quotas. Separate them out into their own header file so they can
easily be shared with userspace.

Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
23 Jul, 2013
1 commit
-
Start using pquotino and define a macro to check if the
superblock has pquotino.

Keep backward compatibility by allowing mount of an older superblock
with no separate pquota inode.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
11 Jul, 2013
1 commit
-
Add project quota changes to all the places where group quota field
is used:
* add separate project quota members into various structures
* split project quota and group quotas so that instead of overriding
  the group quota members incore, the new project quota members are
  used instead
* get rid of usage of the OQUOTA flag incore, in favor of separate
  group and project quota flags.
* add a project dquot argument to various functions.

Not using the pquotino field from superblock yet.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
29 Jun, 2013
4 commits
-
Remove all incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD. Instead,
start using the XFS_GQUOTA_.* and XFS_PQUOTA_.* counterparts for GQUOTA and
PQUOTA respectively.

The on-disk copy still uses XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD.
Read and write of the superblock does the conversion from *OQUOTA*
to *[PG]QUOTA*.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
-
In preparation for combined pquota/gquota support, for the sake
of readability, do some code cleanup surrounding the affected
code.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
-
In preparation for combined pquota/gquota support, for the sake
of readability, change the macro to an inline function.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
-
In preparation for combined pquota/gquota support, define
a new function to check if the given inode is a quota inode.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
05 Jun, 2013
1 commit
-
Calculating dquot CRCs when the backing buffer is written back just
doesn't work reliably. There are several places which manipulate
dquots directly in the buffers, and they don't calculate CRCs
appropriately, nor do they always set the buffer up to calculate
CRCs appropriately.

Firstly, if we log a dquot buffer (e.g. during allocation) it gets
logged without valid CRC, and so on recovery we end up with a dquot
that is not valid.

Secondly, if we recover/repair a dquot, we don't have a verifier
attached to the buffer and hence CRCs are not calculated on the way
down to disk.

Thirdly, calculating the CRC after we've changed the contents means
that if we re-read the dquot from the buffer, we cannot verify the
contents of the dquot are valid, as the CRC is invalid.

So, to avoid all the dquot CRC errors that are being detected by the
read verifier, change to using the same model as for inodes. That
is, dquot CRCs are calculated and written to the backing buffer at
the time the dquot is flushed to the backing buffer. If we modify
the dquot directly in the backing buffer, calculate the CRC
immediately after the modification is complete. Hence the dquot in
the on-disk buffer should always have a valid CRC.

Signed-off-by: Dave Chinner
Reviewed-by: Brian Foster
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
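
The "recalculate the CRC immediately after every in-buffer modification" rule from the commit above can be sketched as follows. Everything prefixed with toy_ is hypothetical and only illustrates the model; the record layout is not the real struct xfs_dqblk, and the bitwise CRC32C routine is a plain reference implementation rather than the kernel's checksum helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical record living in a backing buffer: payload, then CRC. */
struct toy_dqblk {
	uint64_t blocks_used;
	uint64_t inodes_used;
	uint32_t crc;
};

/* Checksum covers every byte that precedes the crc field. */
uint32_t toy_dqblk_cksum(const struct toy_dqblk *d)
{
	return crc32c(d, offsetof(struct toy_dqblk, crc));
}

/* Modify the record in place and update the CRC straight away. */
void toy_dqblk_update(struct toy_dqblk *d, uint64_t blocks, uint64_t inodes)
{
	d->blocks_used = blocks;
	d->inodes_used = inodes;
	d->crc = toy_dqblk_cksum(d);
}

/* Read-side check: returns 1 when the stored CRC matches the contents. */
int toy_dqblk_verify(const struct toy_dqblk *d)
{
	return d->crc == toy_dqblk_cksum(d);
}
```

Because toy_dqblk_update() is the only writer, a re-read of the record from the buffer always verifies; a modification that bypasses it leaves a stale CRC that toy_dqblk_verify() rejects.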
22 Apr, 2013
1 commit
-
Use the reserved space in struct xfs_dqblk to store a UUID and a CRC
for the quota blocks.

[dchinner@redhat.com] Add an LSN field and update for current verifier
infrastructure.

Signed-off-by: Christoph Hellwig
Signed-off-by: Dave Chinner
Reviewed-by: Ben Myers
Signed-off-by: Ben Myers
23 Mar, 2013
1 commit
-
Modify xfs_qm_adjust_dqlimits() to take the xfs_dquot as a
parameter instead of just the xfs_disk_dquot_t so we can update
in-memory fields if necessary.

Signed-off-by: Brian Foster
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
02 Feb, 2013
1 commit
-
The transaction that writes the incore superblock changes of quota flags
to disk reserves the same log space as the clear/reset quota flags
transaction, hence we can use XFS_TRANS_SBCHANGE_LOG_RES() for it as well.

Signed-off-by: Jie Liu
CC: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
30 Nov, 2012
1 commit
-
When we fail to get a dquot lock during reclaim, we jump to an error
handler that unlocks the dquot. This is wrong as we didn't lock the
dquot, and unlocking it means whoever is holding the lock has had
it silently taken away, and hence it results in a lock imbalance.

Found by inspection while modifying the code for the numa-lru
patchset. This fixes a random hang I've been seeing on xfstest 232
for the past several months.

cc:
Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers
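
The lock imbalance fixed by the commit above comes down to which error label a failed trylock jumps to. A minimal sketch, assuming a toy non-blocking lock in place of the real dquot lock (all toy_ names are hypothetical and not from the patch):

```c
#include <stdbool.h>

/* Toy non-blocking lock that stands in for the dquot lock. */
struct toy_lock {
	bool held;
};

bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

struct toy_dquot {
	struct toy_lock lock;
	int refs;
	bool reclaimed;
};

/* Returns true when the dquot was reclaimed, false when it was skipped. */
bool toy_reclaim_one(struct toy_dquot *dq)
{
	if (!toy_trylock(&dq->lock))
		goto out_busy;		/* we never took the lock: do not unlock */

	if (dq->refs > 0)
		goto out_unlock;	/* the lock is ours: release it */

	dq->reclaimed = true;
	toy_unlock(&dq->lock);
	return true;

out_unlock:
	toy_unlock(&dq->lock);
out_busy:
	return false;
}
```

Jumping to out_unlock from the failed trylock would clear a lock held by someone else, which is the bug the commit message describes.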
16 Nov, 2012
4 commits
-
To separate the verifiers from iodone functions and associate read
and write verifiers at the same time, introduce a buffer verifier
operations structure to the xfs_buf.

This avoids the need for assigning the write verifier, clearing the
iodone function and re-running ioend processing in the read
verifier, and gets rid of the nasty "b_pre_io" name for the write
verifier function pointer. If we ever need to, it will also be
easier to add further content specific callbacks to a buffer with an
ops structure in place.

We also avoid needing to export verifier functions, instead we
can simply export the ops structures for those that are needed
outside the function they are defined in.

This patch also fixes a directory block readahead verifier issue
it exposed.

This patch also adds ops callbacks to the inode/alloc btree blocks
initialised by growfs. These will need more work before they will
work with CRCs.

Signed-off-by: Dave Chinner
Reviewed-by: Phil White
Signed-off-by: Ben Myers
-
These verifiers are essentially the same code as the read verifiers,
but do not require ioend processing. Hence factor the read verifier
functions and add a new write verifier wrapper that is used as the
callback.

This is done as one large patch for all verifiers rather than one
patch per verifier as the change is largely mechanical. This
includes hooking up the write verifier via the read verifier
function.

Hooking up the write verifier for buffers obtained via
xfs_trans_get_buf() will be done in a separate patch as that touches
code in many different places rather than just the verifier
functions.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
Add a dquot buffer verify callback function and pass it into the
buffer read functions. This checks all the dquots in a buffer, but
cannot completely verify the dquot ids are correct. Also, errors
cannot be repaired, so an additional function is added to repair bad
dquots in the buffer if such an error is detected in a context where
repair is allowed.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Phil White
Signed-off-by: Ben Myers
-
Add a verifier function callback capability to the buffer read
interfaces. This will be used by the callers to supply a function
that verifies the contents of the buffer when it is read from disk.
This patch does not provide callback functions, but simply modifies
the interfaces to allow them to be called.

The reason for adding this to the read interfaces is that it is very
difficult to tell from the outside if a buffer was just read from
disk or whether we just pulled it out of cache. Supplying a callback
allows the buffer cache to use its internal knowledge of the buffer
to execute it only when the buffer is read from disk.

It is intended that the verifier functions will mark the buffer with
an EFSCORRUPTED error when verification fails. This allows the
reading context to distinguish a verification error from an IO
error, and potentially take further actions on the buffer (e.g.
attempt repair) based on the error reported.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Phil White
Signed-off-by: Ben Myers
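
The read-side verifier callback described in the commit above can be sketched as below. This is a toy model under stated assumptions: struct toy_buf, toy_buf_read() and TOY_EFSCORRUPTED are hypothetical names, not the xfs_buf interfaces, and the "disk read" is simulated by flipping a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the EFSCORRUPTED error code. */
#define TOY_EFSCORRUPTED 117

struct toy_buf {
	bool in_cache;		/* contents already read and cached */
	int error;		/* 0, or the error recorded by a verifier */
	unsigned char data[8];
};

typedef void (*toy_verify_fn)(struct toy_buf *bp);

/*
 * Only the buffer cache knows whether this call hit the cache or had to
 * go to disk, so it is the one that decides when to run the verifier.
 */
void toy_buf_read(struct toy_buf *bp, toy_verify_fn verify)
{
	if (bp->in_cache)
		return;		/* cache hit: contents were verified earlier */

	/* A real implementation would issue the disk read here. */
	bp->in_cache = true;
	bp->error = 0;
	if (verify)
		verify(bp);
}

/* Example verifier: the first byte must carry a magic value. */
void toy_magic_verify(struct toy_buf *bp)
{
	if (bp->data[0] != 0x58)
		bp->error = TOY_EFSCORRUPTED;
}
```

A caller can then tell a verification failure apart from an I/O error by checking bp->error for TOY_EFSCORRUPTED, and decide whether to attempt repair.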
18 Oct, 2012
1 commit
-
The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
along with the other inode cache functions. This removes all functionality from
xfs_iget.c, so the file can simply be removed.

This move results in various functions now only having the scope of a single
file (e.g. xfs_inode_free()), so clean up all the definitions and exported
prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.

Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
15 Jun, 2012
1 commit
-
XFS_MAXIOFFSET() is just a simple macro that resolves to
mp->m_maxioffset. It doesn't need to exist, and it just makes the
code unnecessarily loud and shouty.

Make it quiet and easy to read.

Signed-off-by: Dave Chinner
Reviewed-by: Eric Sandeen
Signed-off-by: Ben Myers
15 May, 2012
4 commits
-
Untangle the header file includes a bit by moving the definition of
xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on
xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include
xfs_ag.h.

Signed-off-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
Queue delwri buffers on a local on-stack list instead of a per-buftarg one,
and write back the buffers per-process instead of by waking up xfsbufd.

This is now easily doable given that we have very few places left that write
delwri buffers:

 - log recovery:
	Only done at mount time, and already forcing out the buffers
	synchronously using xfs_flush_buftarg

 - quotacheck:
	Same story.

 - dquot reclaim:
	Writes out dirty dquots on the LRU under memory pressure. We might
	want to look into doing more of this via xfsaild, but it's already
	more optimal than the synchronous inode reclaim that writes each
	buffer synchronously.

 - xfsaild:
	This is the main beneficiary of the change. By keeping a local list
	of buffers to write we reduce latency of writing out buffers, and
	more importantly we can remove all the delwri list promotions which
	were hitting the buffer cache hard under sustained metadata loads.

The implementation is very straightforward - xfs_buf_delwri_queue now gets
a new list_head pointer that it adds the delwri buffers to, and all callers
need to eventually submit the list using xfs_buf_delwri_submit or
xfs_buf_delwri_submit_nowait. Buffers that already are on a delwri list are
skipped in xfs_buf_delwri_queue, assuming they already are on another delwri
list. The biggest change to pass down the buffer list was done to the AIL
pushing. Now that we operate on buffers the trylock, push and pushbuf log
item methods are merged into a single push routine, which tries to lock the
item, and if possible add the buffer that needs writeback to the buffer list.
This leads to much simpler code than the previous split but requires the
individual IOP_PUSH instances to unlock and reacquire the AIL around calls
to blocking routines.

Given that xfsailds now also handle writing out buffers, the conditions for
log forcing and the sleep times needed some small changes. The most
important one is that we consider an AIL busy as long as we still have buffers
to push, and the other one is that we do increment the pushed LSN for
buffers that are under flushing at this moment, but still count them towards
the stuck items for restart purposes. Without this we could hammer on stuck
items without ever forcing the log and not make progress under heavy random
delete workloads on fast flash storage devices.

[ Dave Chinner:
	- rebase on previous patches.
	- improved comments for XBF_DELWRI_Q handling
	- fix XBF_ASYNC handling in queue submission (test 106 failure)
	- rename delwri submit function buffer list parameters for clarity
	- xfs_efd_item_push() should return XFS_ITEM_PINNED ]

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
Instead of writing the buffer directly from inside xfs_qm_dqflush return it
to the caller and let the caller decide what to do with the buffer. Also
remove the pincount check in xfs_qm_dqflush that all non-blocking callers
already implement and the now unused flags parameter and the XFS_DQ_IS_DIRTY
check that all callers already perform.

[ Dave Chinner: fixed build error caused by missing '{'. ]

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Reviewed-by: Mark Tinguely
Signed-off-by: Ben Myers
-
Check if we actually need to attach a dquot before taking the ilock in
xfs_qm_dqattach. This avoids superfluous lock roundtrips for the common cases
of quota support compiled in but not activated on a filesystem and an
inode that already has the dquots attached.

Signed-off-by: Christoph Hellwig
Reviewed-by: Mark Tinguely
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers
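
The "check before taking the lock" idea in the commit above can be sketched like this. All toy_ names, the structure layouts and the lock_roundtrips counter are hypothetical and exist only to make the saved lock/unlock pairs visible; the real code operates on the XFS inode and its ilock.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_mount {
	bool uquota_on;
	bool gquota_on;
};

struct toy_inode {
	struct toy_mount *mp;
	void *udquot;		/* attached user dquot, or NULL */
	void *gdquot;		/* attached group dquot, or NULL */
	int lock_roundtrips;	/* counts lock/unlock pairs for illustration */
};

/* Placeholder objects that stand in for looked-up dquots. */
static int toy_udq, toy_gdq;

/* Cheap unlocked check: is there anything left to attach? */
bool toy_need_dqattach(const struct toy_inode *ip)
{
	if (ip->mp->uquota_on && !ip->udquot)
		return true;
	if (ip->mp->gquota_on && !ip->gdquot)
		return true;
	return false;
}

int toy_dqattach(struct toy_inode *ip)
{
	if (!toy_need_dqattach(ip))
		return 0;	/* common case: no lock taken at all */

	ip->lock_roundtrips++;	/* stands in for lock ... unlock */
	if (ip->mp->uquota_on && !ip->udquot)
		ip->udquot = &toy_udq;
	if (ip->mp->gquota_on && !ip->gdquot)
		ip->gdquot = &toy_gdq;
	return 0;
}
```

With quotas disabled on the mount, or with both dquots already attached, toy_dqattach() returns without touching the lock counter.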
15 Mar, 2012
5 commits
-
If we initialize the slab caches for the quota code when XFS is loaded there
is no need for a global and reference counted quota manager structure. Drop
all this overhead and also fix the error handling during quota initialization.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers
-
Instead of keeping a separate per-filesystem list of dquots we can walk
the radix tree for the two places where we need to iterate all quota
structures.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers
-
Replace the global hash tables for looking up in-memory dquot structures
with per-filesystem radix trees to allow scaling to a large number of
in-memory dquot structures.

Reviewed-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers
-
Replace the global dquot lru lists with a per-filesystem one.
Note that the shrinker isn't wired up to the per-superblock VFS shrinker
infrastructure as it would have problems summing up and splitting the counts
for inodes and dquots. I don't think this is a major problem as the quota
cache isn't as intertwined with the inode cache as the dentry cache is,
because an inode that is dropped from the cache will generally release
a dquot reference, but most of the time it won't be the last one.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers
-
Switch the quota code over to use the generic XFS statistics infrastructure.
While the legacy /proc/fs/xfs/xqm and /proc/fs/xfs/xqmstats interfaces are
preserved for now, the statistics that still have a meaning with the current
code are now also available from /proc/fs/xfs/stats.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers
11 Feb, 2012
1 commit
-
Stop reusing dquots from the freelist when allocating new ones directly, and
implement a shrinker that actually follows the specifications for the
interface. The shrinker implementation is still highly suboptimal at this
point, but we can gradually work on it.

This also fixes a bug in the previous lock ordering, where we would take
the hash and dqlist locks inside of the freelist lock against the normal
lock ordering. This is only solvable by introducing the dispose list,
and thus not when using direct reclaim of unused dquots for new allocations.

As a side-effect the quota upper bound and used to free ratio values in
/proc/fs/xfs/xqm are set to 0 as these values don't make any sense in the
new world order.

Signed-off-by: Christoph Hellwig
Signed-off-by: Ben Myers

(cherry picked from commit 04da0c8196ac0b12fb6b84f4b7a51ad2fa56d869)
04 Feb, 2012
1 commit
-
Define a new macro XFS_ALL_QUOTA_ACTIVE and simplify some usage
of quota macros.

Signed-off-by: Chandra Seetharaman
Reviewed-by: Christoph Hellwig
Signed-off-by: Ben Myers
17 Dec, 2011
1 commit
-
There is no reason to drop qi_dqlist_lock around calls to xfs_qm_dqrele
because the free list lock now nests inside qi_dqlist_lock and the
dquot lock.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers
16 Dec, 2011
1 commit
-
Just read the id 0 dquot from disk directly in xfs_qm_init_quotainfo instead
of going through dqget and requiring a special flag to not add the dquot to
any lists.

Signed-off-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Ben Myers