18 Dec, 2019
1 commit
-
Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.Previous efforts were made in 2 functions, see:
c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodesbut there hasn't been an audit of all walkers, so do that now. This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.One loop remains: remove_dquot_ref(), because I'm not quite sure how
to deal with that one w/o taking the i_lock.Signed-off-by: Eric Sandeen
Reviewed-by: Jan Kara
Signed-off-by: Al Viro
07 Dec, 2019
1 commit
-
Pull vfs d_inode/d_flags memory ordering fixes from Al Viro:
"Fallout from tree-wide audit for ->d_inode/->d_flags barriers use.
Basically, the problem is that negative pinned dentries require
careful treatment - unless ->d_lock is locked or parent is held at
least shared, another thread can make them positive right under us.Most of the uses turned out to be safe - the main surprises as far as
filesystems are concerned were- race in dget_parent() fastpath, that might end up with the caller
observing the returned dentry _negative_, due to insufficient
barriers. It is positive in memory, but we could end up seeing the
wrong value of ->d_inode in CPU cache. Fixed.- manual checks that result of lookup_one_len_unlocked() is positive
(and rejection of negatives). Again, insufficient barriers (we
might end up with inconsistent observed values of ->d_inode and
->d_flags). Fixed by switching to a new primitive that does the
checks itself and returns ERR_PTR(-ENOENT) instead of a negative
dentry. That way we get rid of boilerplate converting negatives
into ERR_PTR(-ENOENT) in the callers and have a single place to
deal with the barrier-related mess - inside fs/namei.c rather than
in every caller out there.The guts of pathname resolution *do* need to be careful - the race
found by Ritesh is real, as well as several similar races.
Fortunately, it turns out that we can take care of that with fairly
local changes in there.The tree-wide audit had not been fun, and I hate the idea of repeating
it. I think the right approach would be to annotate the places where
we are _not_ guaranteed ->d_inode/->d_flags stability and have sparse
catch regressions. But I'm still not sure what would be the least
invasive way of doing that and it's clearly the next cycle fodder"* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/namei.c: fix missing barriers when checking positivity
fix dget_parent() fastpath race
new helper: lookup_positive_unlocked()
fs/namei.c: pull positivity check into follow_managed()
16 Nov, 2019
1 commit
-
Most of the callers of lookup_one_len_unlocked() treat negatives are
ERR_PTR(-ENOENT). Provide a helper that would do just that. Note
that a pinned positive dentry remains positive - it's ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.Signed-off-by: Al Viro
11 Nov, 2019
1 commit
-
Quota statistics counted as 64-bit per-cpu counter. Reading sums per-cpu
fractions as signed 64-bit int, filters negative values and then reports
lower half as signed 32-bit int.Result may looks like:
fs.quota.allocated_dquots = 22327
fs.quota.cache_hits = -489852115
fs.quota.drops = -487288718
fs.quota.free_dquots = 22083
fs.quota.lookups = -486883485
fs.quota.reads = 22327
fs.quota.syncs = 335064
fs.quota.writes = 3088689Values bigger than 2^31-1 reported as negative.
All counters except "allocated_dquots" and "free_dquots" are monotonic,
thus they should be reported as is without filtering negative values.Kernel doesn't have generic helper for 64-bit sysctl yet,
let's use at least unsigned long.Link: https://lore.kernel.org/r/157337934693.2078.9842146413181153727.stgit@buzz
Signed-off-by: Konstantin Khlebnikov
Signed-off-by: Jan Kara
06 Nov, 2019
1 commit
04 Nov, 2019
6 commits
-
Make dquot_get_state() gracefully handle a situation when there are no
quota files present even though quotas are enabled.Signed-off-by: Jan Kara
-
Quota on and quota off are protected by s_umount semaphore held in
exclusive mode since commit 7d6cd73d33b6 "quota: Hold s_umount in
exclusive mode when enabling / disabling quotas". This makes it
impossible for dquot_disable() to race with other enabling or disabling
of quotas. Simplify the cleanup done by dquot_disable() based on this
fact and also remove some stale comments. As a bonus this cleanup makes
dquot_disable() properly handle a case when there are no quota inodes.Signed-off-by: Jan Kara
-
Now dquot_enable() has only two internal callers and both of them just
need to update quota flags and don't need most of checks. Just drop
dquot_enable() and fold necessary functionality into the two calling
places.Signed-off-by: Jan Kara
-
Rename vfs_load_quota_inode() to dquot_load_quota_inode() to be
consistent with naming of other functions used for enabling quota
accounting from filesystems. Also export the function and add some
sanity checks to assure filesystems are calling the function properly.Signed-off-by: Jan Kara
-
We already have quota inode loaded when resuming quotas. Use
vfs_load_quota() to avoid some pointless churn with the quota inode.Signed-off-by: Jan Kara
-
Factor out setting up of quota inode and eventual error cleanup from
vfs_load_quota_inode(). This will simplify situation for filesystems
that don't have any quota inodes.Signed-off-by: Jan Kara
01 Nov, 2019
2 commits
-
There is a race window where quota was redirted once we drop dq_list_lock inside dqput(),
but before we grab dquot->dq_lock inside dquot_release()TASK1 TASK2 (chowner)
->dqput()
we_slept:
spin_lock(&dq_list_lock)
if (dquot_dirty(dquot)) {
spin_unlock(&dq_list_lock);
dquot->dq_sb->dq_op->write_dquot(dquot);
goto we_slept
if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
spin_unlock(&dq_list_lock);
dquot->dq_sb->dq_op->release_dquot(dquot);
dqget()
mark_dquot_dirty()
dqput()
goto we_slept;
}
So dquot dirty quota will be released by TASK1, but on next we_sleept loop
we detect this and call ->write_dquot() for it.
XFSTEST: https://github.com/dmonakhov/xfstests/commit/440a80d4cbb39e9234df4d7240aee1d551c36107Link: https://lore.kernel.org/r/20191031103920.3919-2-dmonakhov@openvz.org
CC: stable@vger.kernel.org
Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara -
Write only quotas which are dirty at entry.
XFSTEST: https://github.com/dmonakhov/xfstests/commit/b10ad23566a5bf75832a6f500e1236084083cddc
Link: https://lore.kernel.org/r/20191031103920.3919-1-dmonakhov@openvz.org
CC: stable@vger.kernel.org
Signed-off-by: Konstantin Khlebnikov
Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara
04 Oct, 2019
2 commits
-
Code cleanup for hash bits calculation by
calling ilog2().Link: https://lore.kernel.org/r/20190923135223.27674-1-cgxu519@zoho.com.cn
Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara -
It is meaningless to increase DQST_LOOKUPS number while iterating
over dirty/inuse list, so just avoid it.Link: https://lore.kernel.org/r/20190926083408.4269-1-cgxu519@zoho.com.cn
Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara
31 Jul, 2019
1 commit
-
We reset time limit when current usage is smaller
or equal to soft limit in other place, so follow
this rule in do_set_dqblk().Signed-off-by: Chengguang Xu
Link: https://lore.kernel.org/r/20190724053216.19392-1-cgxu519@zoho.com.cn
Signed-off-by: Jan Kara
11 Jul, 2019
1 commit
-
Pull ext2, udf and quota updates from Jan Kara:
- some ext2 fixes and cleanups
- a fix of udf bug when extending files
- a fix of quota Q_XGETQSTAT[V] handling
* tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix incorrect final NOT_ALLOCATED (hole) extent length
ext2: Use kmemdup rather than duplicating its implementation
quota: honor quota type in Q_XGETQSTAT[V] calls
ext2: Always brelse bh on failure in ext2_iget()
ext2: add missing brelse() in ext2_iget()
ext2: Fix a typo in ext2_getattr argument
ext2: fix a typo in comment
ext2: add missing brelse() in ext2_new_inode()
ext2: optimize ext2_xattr_get()
ext2: introduce new helper for xattr entry comparison
ext2: merge xattr next entry check to ext2_xattr_entry_valid()
ext2: code cleanup for ext2_preread_inode()
ext2: code cleanup by using test_opt() and clear_opt()
doc: ext2: update description of quota options for ext2
ext2: Strengthen xattr block checks
ext2: Merge loops in ext2_xattr_set()
ext2: introduce helper for xattr entry validation
ext2: introduce helper for xattr header validation
quota: add dqi_dirty_list description to comment of Dquot List Management
19 Jun, 2019
1 commit
-
Run below script as root, dquot_add_space will return -EDQUOT since
__dquot_transfer call dquot_add_space with flags=0, and dquot_add_space
think it's a preallocation. Fix it by set flags as DQUOT_SPACE_WARN.mkfs.ext4 -O quota,project /dev/vdb
mount -o prjquota /dev/vdb /mnt
setquota -P 23 1 1 0 0 /dev/vdb
dd if=/dev/zero of=/mnt/test-file bs=4K count=1
chattr -p 23 test-fileFixes: 7b9ca4c61bc2 ("quota: Reduce contention on dq_data_lock")
Signed-off-by: yangerkun
Signed-off-by: Jan Kara
20 May, 2019
1 commit
-
Actually there are four lists for dquot management, so add
the description of dqui_dirty_list to comment.Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara
01 May, 2019
1 commit
-
When we fail from allocating inode/space, we back out
the change we already did. In a special case which has
exceeded soft limit by the change, we should also check
time limit and reset it properly.Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara
25 Apr, 2019
2 commits
-
Local variable *reserved* of remove_dquot_ref() is only used if
define CONFIG_QUOTA_DEBUG, but not ebraced in CONFIG_QUOTA_DEBUG
macro, which leads to unused-but-set-variable warning when compiling.This patch ebrace it into CONFIG_QUOTA_DEBUG macro like what is done
in add_dquot_ref().Signed-off-by: Jiang Biao
Signed-off-by: Jan Kara -
We need to check return code only when calling ->read_dqblk(),
so fix it properly.Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara
26 Mar, 2019
2 commits
-
This removes all trailing whitespaces in fs/quota/.
Signed-off-by: Sascha Hauer
Signed-off-by: Jan Kara -
Replace (flags & DQUOT_SPACE_RESERVE) with
variable reserve.Signed-off-by: Chengguang Xu
Signed-off-by: Jan Kara
20 Jun, 2018
2 commits
-
Use list_first_entry() and list_empty() instead of opencoded variants.
Reviewed-by: Matthew Wilcox
Signed-off-by: Jan Kara -
The dquots in the free_dquots list are not reclaimed in LRU way.
put_dquot_last() puts entries to the tail and dqcache_shrink_scan()
frees from the tail. Free unreferenced dquots in LRU order because it
seems more reasonable than freeing most recently used.Signed-off-by: Greg Thelen
Signed-off-by: Shakeel Butt
Signed-off-by: Jan Kara
09 Apr, 2018
1 commit
-
dquot_init() is never called in atomic context.
This function is only set as a parameter of fs_initcall().Despite never getting called from atomic context,
dquot_init() calls __get_free_pages() with GFP_ATOMIC,
which waits busily for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
to avoid busy waiting and improve the possibility of sucessful allocation.This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.Signed-off-by: Jia-Ju Bai
Signed-off-by: Jan Kara
29 Nov, 2017
1 commit
-
register_shrinker() might return -ENOMEM error since Linux 3.12.
Call panic() as with other failure checks in this function if
register_shrinker() failed.Fixes: 1d3d4437eae1 ("vmscan: per-node deferred work")
Signed-off-by: Tetsuo Handa
Cc: Jan Kara
Cc: Michal Hocko
Reviewed-by: Michal Hocko
Signed-off-by: Jan Kara
28 Nov, 2017
1 commit
-
In commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()"),
we have propagated error from __dquot_initialize to caller, but we forgot
to handle such error in add_dquot_ref(), so, currently, during quota
accounting information initialization flow, if we failed for some of
inodes, we just ignore such error, and do account for others, which is
not a good implementation.In this patch, we choose to let user be aware of such error, so after
turning on quota successfully, we can make sure all inodes disk usage
can be accounted, which will be more reasonable.Suggested-by: Jan Kara
Signed-off-by: Chao Yu
Signed-off-by: Jan Kara
15 Nov, 2017
1 commit
-
Pull quota, ext2, isofs and udf fixes from Jan Kara:
- two small quota error handling fixes
- two isofs fixes for architectures with signed char
- several udf block number overflow and signedness fixes
- ext2 rework of mount option handling to avoid GFP_KERNEL allocation
with spinlock held- ... it also contains a patch to implement auditing of responses to
fanotify permission events. That should have been in the fanotify
pull request but I mistakenly merged that patch into a wrong branch
and noticed only now at which point I don't think it's worth rebasing
and redoing.* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: be aware of error from dquot_initialize
quota: fix potential infinite loop
isofs: use unsigned char types consistently
isofs: fix timestamps beyond 2027
udf: Fix some sign-conversion warnings
udf: Fix signed/unsigned format specifiers
udf: Fix 64-bit sign extension issues affecting blocks > 0x7FFFFFFF
udf: Remove some outdate references from documentation
udf: Avoid overflow when session starts at large offset
ext2: Fix possible sleep in atomic during mount option parsing
ext2: Parse mount options into a dedicated structure
audit: Record fanotify access control decisions
14 Nov, 2017
1 commit
13 Nov, 2017
1 commit
-
Commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()")
missed to handle error from dquot_initialize in dquot_file_open, fix it.Signed-off-by: Chao Yu
Signed-off-by: Jan Kara
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
01 Nov, 2017
1 commit
-
In dquot_writeback_dquots(), we write back dquot from dirty dquots
list. There is a potential infinite loop if ->write_dquot() failure
and forget remove dquot from the list. This patch clear dirty bit
anyway to avoid it.Signed-off-by: zhangyi (F)
Signed-off-by: Jan Kara
10 Oct, 2017
1 commit
-
Eryu has reported that since commit 7b9ca4c61bc2 "quota: Reduce
contention on dq_data_lock" test generic/233 occasionally fails. This is
caused by the fact that since that commit we don't generate warning and
set grace time for quota allocations that have DQUOT_SPACE_NOFAIL set
(these are for example some metadata allocations in ext4). We need these
allocations to behave regularly wrt warning generation and grace time
setting so fix the code to return to the original behavior.Reported-and-tested-by: Eryu Guan
CC: stable@vger.kernel.org
Fixes: 7b9ca4c61bc278b771fb57d6290a31ab1fc7fdac
Signed-off-by: Jan Kara
18 Sep, 2017
1 commit
-
Lock dq_dqb_lock around dquot_decr_inodes()
Signed-off-by: Konstantin Khlebnikov
Fixes: 7b9ca4c61bc2 ("quota: Reduce contention on dq_data_lock")
Signed-off-by: Jan Kara
08 Sep, 2017
1 commit
-
Pull quota scaling updates from Jan Kara:
"This contains changes to make the quota subsystem more scalable.Reportedly it improves number of files created per second on ext4
filesystem on fast storage by about a factor of 2x"* 'quota_scaling' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (28 commits)
quota: Add lock annotations to struct members
quota: Reduce contention on dq_data_lock
fs: Provide __inode_get_bytes()
quota: Inline dquot_[re]claim_reserved_space() into callsite
quota: Inline inode_{incr,decr}_space() into callsites
quota: Inline functions into their callsites
ext4: Disable dirty list tracking of dquots when journalling quotas
quota: Allow disabling tracking of dirty dquots in a list
quota: Remove dq_wait_unused from dquot
quota: Move locking into clear_dquot_dirty()
quota: Do not dirty bad dquots
quota: Fix possible corruption of dqi_flags
quota: Propagate ->quota_read errors from v2_read_file_info()
quota: Fix error codes in v2_read_file_info()
quota: Push dqio_sem down to ->read_file_info()
quota: Push dqio_sem down to ->write_file_info()
quota: Push dqio_sem down to ->get_next_id()
quota: Push dqio_sem down to ->release_dqblk()
quota: Remove locking for writing to the old quota format
quota: Do not acquire dqio_sem for dquot overwrites in v2 format
...
18 Aug, 2017
3 commits
-
dq_data_lock is currently used to protect all modifications of quota
accounting information, consistency of quota accounting on the inode,
and dquot pointers from inode. As a result contention on the lock can be
pretty heavy.Reduce the contention on the lock by protecting quota accounting
information by a new dquot->dq_dqb_lock and consistency of quota
accounting with inode usage by inode->i_lock.This change reduces time to create 500000 files on ext4 on ramdisk by 50
different processes in separate directories by 6% when user quota is
turned on. When those 50 processes belong to 50 different users, the
improvement is about 9%.Signed-off-by: Jan Kara
-
dquot_claim_reserved_space() and dquot_reclaim_reserved_space() have
only a single callsite. Inline them there.Signed-off-by: Jan Kara
-
inode_incr_space() and inode_decr_space() have only two callsites.
Inline them there as that will make locking changes simpler.Signed-off-by: Jan Kara