09 Feb, 2020
1 commit
-
Pull misc vfs updates from Al Viro:
- bmap series from cmaiolino
- getting rid of convolutions in copy_mount_options() (use a couple of
copy_from_user() instead of the __get_user() crap)* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
saner copy_mount_options()
fibmap: Reject negative block numbers
fibmap: Use bmap instead of ->bmap method in ioctl_fibmap
ecryptfs: drop direct calls to ->bmap
cachefiles: drop direct usage of ->bmap method.
fs: Enable bmap() function to properly return errors
05 Feb, 2020
1 commit
-
Pull vfs timestamp updates from Al Viro:
"More 64bit timestamp work"* 'imm.timestamp' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kernfs: don't bother with timestamp truncation
fs: Do not overload update_time
fs: Delete timespec64_trunc()
fs: ubifs: Eliminate timespec64_trunc() usage
fs: ceph: Delete timespec64_trunc() usage
fs: cifs: Delete usage of timespec64_trunc
fs: fat: Eliminate timespec64_trunc() usage
utimes: Clamp the timestamps in notify_change()
04 Feb, 2020
1 commit
-
'PTR_ERR(p) == -E*' is a stronger condition than IS_ERR(p).
Hence, IS_ERR(p) is unneeded.The semantic patch that generates this commit is as follows:
//
@@
expression ptr;
constant error_code;
@@
-IS_ERR(ptr) && (PTR_ERR(ptr) == - error_code)
+PTR_ERR(ptr) == - error_code
//Link: http://lkml.kernel.org/r/20200106045833.1725-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada
Cc: Julia Lawall
Acked-by: Stephen Boyd [drivers/clk/clk.c]
Acked-by: Bartosz Golaszewski [GPIO]
Acked-by: Wolfram Sang [drivers/i2c]
Acked-by: Rafael J. Wysocki [acpi/scan.c]
Acked-by: Rob Herring
Cc: Eric Biggers
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Feb, 2020
1 commit
-
By now, bmap() will either return the physical block number related to
the requested file offset or 0 in case of error or the requested offset
maps into a hole.
This patch makes the needed changes to enable bmap() to proper return
errors, using the return value as an error return, and now, a pointer
must be passed to bmap() to be filled with the mapped physical block.It will change the behavior of bmap() on return:
- negative value in case of error
- zero on success or map fell into a holeIn case of a hole, the *block will be zero too
Since this is a prep patch, by now, the only error return is -EINVAL if
->bmap doesn't exist.Reviewed-by: Christoph Hellwig
Signed-off-by: Carlos Maiolino
Signed-off-by: Al Viro
31 Jan, 2020
1 commit
-
Pull f2fs updates from Jaegeuk Kim:
"In this series, we've implemented transparent compression
experimentally. It supports LZO and LZ4, but will add more later as we
investigate in the field more.At this point, the feature doesn't expose compressed space to user
directly in order to guarantee potential data updates later to the
space. Instead, the main goal is to reduce data writes to flash disk
as much as possible, resulting in extending disk life time as well as
relaxing IO congestion.Alternatively, we're also considering to add ioctl() to reclaim
compressed space and show it to user after putting the immutable bit.Enhancements:
- add compression support
- avoid unnecessary locks in quota ops
- harden power-cut scenario for zoned block devices
- use private bio_set to avoid IO congestion
- replace GC mutex with rwsem to serialize callersBug fixes:
- fix dentry consistency and memory corruption in rename()'s error case
- fix wrong swap extent reports
- fix casefolding bugs
- change lock coverage to avoid deadlock
- avoid GFP_KERNEL under f2fs_lock_opAnd, we've cleaned up sysfs entries to prepare no debugfs"
* tag 'f2fs-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (31 commits)
f2fs: fix race conditions in ->d_compare() and ->d_hash()
f2fs: fix dcache lookup of !casefolded directories
f2fs: Add f2fs stats to sysfs
f2fs: delete duplicate information on sysfs nodes
f2fs: change to use rwsem for gc_mutex
f2fs: update f2fs document regarding to fsync_mode
f2fs: add a way to turn off ipu bio cache
f2fs: code cleanup for f2fs_statfs_project()
f2fs: fix miscounted block limit in f2fs_statfs_project()
f2fs: show the CP_PAUSE reason in checkpoint traces
f2fs: fix deadlock allocating bio_post_read_ctx from mempool
f2fs: remove unneeded check for error allocating bio_post_read_ctx
f2fs: convert inline_dir early before starting rename
f2fs: fix memleak of kobject
f2fs: fix to add swap extent correctly
f2fs: run fsck when getting bad inode during GC
f2fs: support data compression
f2fs: free sysfs kobject
f2fs: declare nested quota_sem and remove unnecessary sems
f2fs: don't put new_page twice in f2fs_rename
...
29 Jan, 2020
1 commit
-
Pull fsverity updates from Eric Biggers:
- Optimize fs-verity sequential read performance by implementing
readahead of Merkle tree pages. This allows the Merkle tree to be
read in larger chunks.- Optimize FS_IOC_ENABLE_VERITY performance in the uncached case by
implementing readahead of data pages.- Allocate the hash requests from a mempool in order to eliminate the
possibility of allocation failures during I/O.* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: use u64_to_user_ptr()
fs-verity: use mempool for hash requests
fs-verity: implement readahead of Merkle tree pages
fs-verity: implement readahead for FS_IOC_ENABLE_VERITY
25 Jan, 2020
2 commits
-
Since ->d_compare() and ->d_hash() can be called in RCU-walk mode,
->d_parent and ->d_inode can be concurrently modified, and in
particular, ->d_inode may be changed to NULL. For f2fs_d_hash() this
resulted in a reproducible NULL dereference if a lookup is done in a
directory being deleted, e.g. with:int main()
{
if (fork()) {
for (;;) {
mkdir("subdir", 0700);
rmdir("subdir");
}
} else {
for (;;)
access("subdir/file", 0);
}
}... or by running the 't_encrypted_d_revalidate' program from xfstests.
Both repros work in any directory on a filesystem with the encoding
feature, even if the directory doesn't actually have the casefold flag.I couldn't reproduce a crash in f2fs_d_compare(), but it appears that a
similar crash is possible there.Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and
falling back to the case sensitive behavior if the inode is NULL.Reported-by: Al Viro
Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups")
Cc: # v5.4+
Signed-off-by: Eric Biggers
Signed-off-by: Jaegeuk Kim -
Do the name comparison for non-casefolded directories correctly.
This is analogous to ext4's commit 66883da1eee8 ("ext4: fix dcache
lookup of !casefolded directories").Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups")
Cc: # v5.4+
Signed-off-by: Eric Biggers
Signed-off-by: Jaegeuk Kim
24 Jan, 2020
1 commit
-
Currently f2fs stats are only available from /d/f2fs/status. This patch
adds some of the f2fs stats to sysfs so that they are accessible even
when debugfs is not mounted.The following sysfs nodes are added:
-/sys/fs/f2fs//free_segments
-/sys/fs/f2fs//cp_foreground_calls
-/sys/fs/f2fs//cp_background_calls
-/sys/fs/f2fs//gc_foreground_calls
-/sys/fs/f2fs//gc_background_calls
-/sys/fs/f2fs//moved_blocks_foreground
-/sys/fs/f2fs//moved_blocks_background
-/sys/fs/f2fs//avg_vblocksSigned-off-by: Hridya Valsaraju
[Jaegeuk Kim: allow STAT_FS without DEBUG_FS]
Signed-off-by: Jaegeuk Kim
18 Jan, 2020
12 commits
-
Mutex lock won't serialize callers, in order to avoid starving of unlucky
caller, let's use rwsem lock instead.Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
This patch adds missing fsync_mode entry in f2fs document.
Fixes: 04485987f053 ("f2fs: introduce async IPU policy")
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
Setting 0x40 in /sys/fs/f2fs/dev/ipu_policy gives a way to turn off
bio cache, which is useufl to check whether block layer using hardware
encryption engine merges IOs correctly.Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
Calling min_not_zero() to simplify complicated prjquota
limit comparison in f2fs_statfs_project().Signed-off-by: Chengguang Xu
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
statfs calculates Total/Used/Avail disk space in block unit,
so we should translate soft/hard prjquota limit to block unit
as well.Below testing result shows the block/inode numbers of
Total/Used/Avail from df command are all correct afer
applying this patch.[root@localhost quota-tools]\# ./repquota -P /dev/sdb1
*** Report for project quotas on device /dev/sdb1
Block grace time: 7days; Inode grace time: 7days
Block limits File limits
Project used soft hard grace used soft hard grace
-----------------------------------------------------------
\#0 -- 4 0 0 1 0 0
\#101 -- 0 0 0 2 0 0
\#102 -- 0 10240 0 2 10 0
\#103 -- 0 0 20480 2 0 20
\#104 -- 0 10240 20480 2 10 20
\#105 -- 0 20480 10240 2 20 10[root@localhost sdb1]\# lsattr -p t{1,2,3,4,5}
101 ----------------N-- t1/a1
102 ----------------N-- t2/a2
103 ----------------N-- t3/a3
104 ----------------N-- t4/a4
105 ----------------N-- t5/a5[root@localhost sdb1]\# df -hi t{1,2,3,4,5}
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sdb1 2.4M 21 2.4M 1% /mnt/sdb1
/dev/sdb1 10 2 8 20% /mnt/sdb1
/dev/sdb1 20 2 18 10% /mnt/sdb1
/dev/sdb1 10 2 8 20% /mnt/sdb1
/dev/sdb1 10 2 8 20% /mnt/sdb1[root@localhost sdb1]\# df -h t{1,2,3,4,5}
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 10G 489M 9.6G 5% /mnt/sdb1
/dev/sdb1 10M 0 10M 0% /mnt/sdb1
/dev/sdb1 20M 0 20M 0% /mnt/sdb1
/dev/sdb1 10M 0 10M 0% /mnt/sdb1
/dev/sdb1 10M 0 10M 0% /mnt/sdb1Fixes: 909110c060f2 ("f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project()")
Signed-off-by: Chengguang Xu
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
Without any form of coordination, any case where multiple allocations
from the same mempool are needed at a time to make forward progress can
deadlock under memory pressure.This is the case for struct bio_post_read_ctx, as one can be allocated
to decrypt a Merkle tree page during fsverity_verify_bio(), which itself
is running from a post-read callback for a data bio which has its own
struct bio_post_read_ctx.Fix this by freeing first bio_post_read_ctx before calling
fsverity_verify_bio(). This works because verity (if enabled) is always
the last post-read step.This deadlock can be reproduced by trying to read from an encrypted
verity file after reducing NUM_PREALLOC_POST_READ_CTXS to 1 and patching
mempool_alloc() to pretend that pool->alloc() always fails.Note that since NUM_PREALLOC_POST_READ_CTXS is actually 128, to actually
hit this bug in practice would require reading from lots of encrypted
verity files at the same time. But it's theoretically possible, as N
available objects doesn't guarantee forward progress when > N/2 threads
each need 2 objects at a time.Fixes: 95ae251fe828 ("f2fs: add fs-verity support")
Signed-off-by: Eric Biggers
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
Since allocating an object from a mempool never fails when
__GFP_DIRECT_RECLAIM (which is included in GFP_NOFS) is set, the check
for failure to allocate a bio_post_read_ctx is unnecessary. Remove it.Signed-off-by: Eric Biggers
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
If we hit an error during rename, we'll get two dentries in different
directories.Chao adds to check the room in inline_dir which can avoid needless
inversion. This should be done by inode_lock(&old_dir).Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
If kobject_init_and_add() failed, caller needs to invoke kobject_put()
to release kobject explicitly.Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
As Youling reported in mailing list:
https://www.linuxquestions.org/questions/linux-newbie-8/the-file-system-f2fs-is-broken-4175666043/
https://www.linux.org/threads/the-file-system-f2fs-is-broken.26490/
There is a test case can corrupt f2fs image:
- dd if=/dev/zero of=/swapfile bs=1M count=4096
- chmod 600 /swapfile
- mkswap /swapfile
- swapon --discard /swapfileThe root cause is f2fs_swap_activate() intends to return zero value
to setup_swap_extents() to enable SWP_FS mode (swap file goes through
fs), in this flow, setup_swap_extents() setups swap extent with wrong
block address range, result in discard_swap() erasing incorrect address.Because f2fs_swap_activate() has pinned swapfile, its data block
address will not change, it's safe to let swap to handle IO through
raw device, so we can get rid of SWAP_FS mode and initial swap extents
inside f2fs_swap_activate(), by this way, later discard_swap() can trim
in right address range.Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO")
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
This is to avoid inifinite GC when trying to disable checkpoint.
Signed-off-by: Jaegeuk Kim
-
This patch tries to support compression in f2fs.
- New term named cluster is defined as basic unit of compression, file can
be divided into multiple clusters logically. One cluster includes 4 << n
(n >= 0) logical pages, compression size is also cluster size, each of
cluster can be compressed or not.- In cluster metadata layout, one special flag is used to indicate cluster
is compressed one or normal one, for compressed cluster, following metadata
maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs stores
data including compress header and compressed data.- In order to eliminate write amplification during overwrite, F2FS only
support compression on write-once file, data can be compressed only when
all logical blocks in file are valid and cluster compress ratio is lower
than specified threshold.- To enable compression on regular inode, there are three ways:
* chattr +c file
* chattr +c dir; touch dir/file
* mount w/ -o compress_extension=ext; touch file.extCompress metadata layout:
[Dnode Structure]
+-----------------------------------------------+
| cluster 1 | cluster 2 | ......... | cluster N |
+-----------------------------------------------+
. . . .
. . . .
. Compressed Cluster . . Normal Cluster .
+----------+---------+---------+---------+ +---------+---------+---------+---------+
|compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 |
+----------+---------+---------+---------+ +---------+---------+---------+---------+
. .
. .
. .
+-------------+-------------+----------+----------------------------+
| data length | data chksum | reserved | compressed data |
+-------------+-------------+----------+----------------------------+Changelog:
20190326:
- fix error handling of read_end_io().
- remove unneeded comments in f2fs_encrypt_one_page().20190327:
- fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages().
- don't jump into loop directly to avoid uninitialized variables.
- add TODO tag in error path of f2fs_write_cache_pages().20190328:
- fix wrong merge condition in f2fs_read_multi_pages().
- check compressed file in f2fs_post_read_required().20190401
- allow overwrite on non-compressed cluster.
- check cluster meta before writing compressed data.20190402
- don't preallocate blocks for compressed file.- add lz4 compress algorithm
- process multiple post read works in one workqueue
Now f2fs supports processing post read work in multiple workqueue,
it shows low performance due to schedule overhead of multiple
workqueue executing orderly.20190921
- compress: support buffered overwrite
C: compress cluster flag
V: valid block address
N: NEW_ADDROne cluster contain 4 blocks
before overwrite after overwrite
- VVVV -> CVNN
- CVNN -> VVVV- CVNN -> CVNN
- CVNN -> CVVV- CVVV -> CVNN
- CVVV -> CVVV20191029
- add kconfig F2FS_FS_COMPRESSION to isolate compression related
codes, add kconfig F2FS_FS_{LZO,LZ4} to cover backend algorithm.
note that: will remove lzo backend if Jaegeuk agreed that too.
- update codes according to Eric's comments.20191101
- apply fixes from Jaegeuk20191113
- apply fixes from Jaegeuk
- split workqueue for fsverity20191216
- apply fixes from Jaegeuk20200117
- fix to avoid NULL pointer dereference[Jaegeuk Kim]
- add tracepoint for f2fs_{,de}compress_pages()
- fix many bugs and add some compression stats
- fix overwrite/mmap bugs
- address 32bit build error, reported by Geert.
- bug fixes when handling errors and i_compressed_blocksReported-by:
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
16 Jan, 2020
9 commits
-
Detected kmemleak.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
1.
f2fs_quota_sync
-> down_read(&sbi->quota_sem)
-> dquot_writeback_dquots
-> f2fs_dquot_commit
-> down_read(&sbi->quota_sem)2.
f2fs_quota_sync
-> down_read(&sbi->quota_sem)
-> f2fs_write_data_pages
-> f2fs_write_single_data_page
-> down_write(&F2FS_I(inode)->i_sem)f2fs_mkdir
-> f2fs_do_add_link
-> down_write(&F2FS_I(inode)->i_sem)
-> f2fs_init_inode_metadata
-> f2fs_new_node_page
-> dquot_alloc_inode
-> f2fs_dquot_mark_dquot_dirty
-> down_read(&sbi->quota_sem)Signed-off-by: Jaegeuk Kim
-
In f2fs_rename(), new_page is gone after f2fs_set_link(), but it tries
to put again when whiteout is failed and jumped to put_out_dir.Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
This patch moves setting I_LINKABLE early in rename2(whiteout) to avoid the
below warning.[ 3189.163385] WARNING: CPU: 3 PID: 59523 at fs/inode.c:358 inc_nlink+0x32/0x40
[ 3189.246979] Call Trace:
[ 3189.248707] f2fs_init_inode_metadata+0x2d6/0x440 [f2fs]
[ 3189.251399] f2fs_add_inline_entry+0x162/0x8c0 [f2fs]
[ 3189.254010] f2fs_add_dentry+0x69/0xe0 [f2fs]
[ 3189.256353] f2fs_do_add_link+0xc5/0x100 [f2fs]
[ 3189.258774] f2fs_rename2+0xabf/0x1010 [f2fs]
[ 3189.261079] vfs_rename+0x3f8/0xaa0
[ 3189.263056] ? tomoyo_path_rename+0x44/0x60
[ 3189.265283] ? do_renameat2+0x49b/0x550
[ 3189.267324] do_renameat2+0x49b/0x550
[ 3189.269316] __x64_sys_renameat2+0x20/0x30
[ 3189.271441] do_syscall_64+0x5a/0x230
[ 3189.273410] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3189.275848] RIP: 0033:0x7f270b4d9a49Signed-off-by: Jaegeuk Kim
-
META_MAPPING is used to move blocks for both encrypted and verity files.
So the META_MAPPING invalidation condition in do_checkpoint() should
consider verity too, not just encrypt.Signed-off-by: Eric Biggers
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
In low memory scenario, we can allocate multiple bios without
submitting any of them.- f2fs_write_checkpoint()
- block_operations()
- f2fs_sync_node_pages()
step 1) flush cold nodes, allocate new bio from mempool
- bio_alloc()
- mempool_alloc()
step 2) flush hot nodes, allocate a bio from mempool
- bio_alloc()
- mempool_alloc()
step 3) flush warm nodes, be stuck in below call path
- bio_alloc()
- mempool_alloc()
- loop to wait mempool element release, as we only
reserved memory for two bio allocation, however above
allocated two bios may never be submitted.So we need avoid using default bioset, in this patch we introduce a
private bioset, in where we enlarg mempool element count to total
number of log header, so that we can make sure we have enough
backuped memory pool in scenario of allocating/holding multiple
bios.Signed-off-by: Gao Xiang
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
Remove duplicate sbi->aw_cnt stats counter that tracks
the number of atomic files currently opened (it also shows
incorrect value sometimes). Use more relit lable sbi->atomic_files
to show in the stats.Signed-off-by: Sahitya Tummala
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
To catch f2fs bugs in write pointer handling code for zoned block
devices, check write pointers of non-open zones that current segments do
not point to. Do this check at mount time, after the fsync data recovery
and current segments' write pointer consistency fix. Or when fsync data
recovery is disabled by mount option, do the check when there is no fsync
data.Check two items comparing write pointers with valid block maps in SIT.
The first item is check for zones with no valid blocks. When there is no
valid blocks in a zone, the write pointer should be at the start of the
zone. If not, next write operation to the zone will cause unaligned write
error. If write pointer is not at the zone start, reset the write pointer
to place at the zone start.The second item is check between the write pointer position and the last
valid block in the zone. It is unexpected that the last valid block
position is beyond the write pointer. In such a case, report as a bug.
Fix is not required for such zone, because the zone is not selected for
next write operation until the zone get discarded.Signed-off-by: Shin'ichiro Kawasaki
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
On sudden f2fs shutdown, write pointers of zoned block devices can go
further but f2fs meta data keeps current segments at positions before the
write operations. After remounting the f2fs, this inconsistency causes
write operations not at write pointers and "Unaligned write command"
error is reported.To avoid the error, compare current segments with write pointers of open
zones the current segments point to, during mount operation. If the write
pointer position is not aligned with the current segment position, assign
a new zone to the current segment. Also check the newly assigned zone has
write pointer at zone start. If not, reset write pointer of the zone.Perform the consistency check during fsync recovery. Not to lose the
fsync data, do the check after fsync data gets restored and before
checkpoint commit which flushes data at current segment positions. Not to
cause conflict with kworker's dirfy data/node flush, do the fix within
SBI_POR_DOING protection.Signed-off-by: Shin'ichiro Kawasaki
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
15 Jan, 2020
1 commit
-
When fs-verity verifies data pages, currently it reads each Merkle tree
page synchronously using read_mapping_page().Therefore, when the Merkle tree pages aren't already cached, fs-verity
causes an extra 4 KiB I/O request for every 512 KiB of data (assuming
that the Merkle tree uses SHA-256 and 4 KiB blocks). This results in
more I/O requests and performance loss than is strictly necessary.Therefore, implement readahead of the Merkle tree pages.
For simplicity, we take advantage of the fact that the kernel already
does readahead of the file's *data*, just like it does for any other
file. Due to this, we don't really need a separate readahead state
(struct file_ra_state) just for the Merkle tree, but rather we just need
to piggy-back on the existing data readahead requests.We also only really need to bother with the first level of the Merkle
tree, since the usual fan-out factor is 128, so normally over 99% of
Merkle tree I/O requests are for the first level.Therefore, make fsverity_verify_bio() enable readahead of the first
Merkle tree level, for up to 1/4 the number of pages in the bio, when it
sees that the REQ_RAHEAD flag is set on the bio. The readahead size is
then passed down to ->read_merkle_tree_page() for the filesystem to
(optionally) implement if it sees that the requested page is uncached.While we're at it, also make build_merkle_tree_level() set the Merkle
tree readahead size, since it's easy to do there.However, for now don't set the readahead size in fsverity_verify_page(),
since currently it's only used to verify holes on ext4 and f2fs, and it
would need parameters added to know how much to read ahead.This patch significantly improves fs-verity sequential read performance.
Some quick benchmarks with 'cat'-ing a 250MB file after dropping caches:On an ARM64 phone (using sha256-ce):
Before: 217 MB/s
After: 263 MB/s
(compare to sha256sum of non-verity file: 357 MB/s)In an x86_64 VM (using sha256-avx2):
Before: 173 MB/s
After: 215 MB/s
(compare to sha256sum of non-verity file: 223 MB/s)Link: https://lore.kernel.org/r/20200106205533.137005-1-ebiggers@kernel.org
Reviewed-by: Theodore Ts'o
Signed-off-by: Eric Biggers
01 Jan, 2020
2 commits
-
The commit 643fa9612bf1 ("fscrypt: remove filesystem specific
build config option") removed modular support for fs/crypto. This
causes the Crypto API to be built-in whenever fscrypt is enabled.
This makes it very difficult for me to test modular builds of
the Crypto API without disabling fscrypt which is a pain.As fscrypt is still evolving and it's developing new ties with the
fs layer, it's hard to build it as a module for now.However, the actual algorithms are not required until a filesystem
is mounted. Therefore we can allow them to be built as modules.Signed-off-by: Herbert Xu
Link: https://lore.kernel.org/r/20191227024700.7vrzuux32uyfdgum@gondor.apana.org.au
Signed-off-by: Eric Biggers -
fscrypt_get_encryption_info() returns 0 if the encryption key is
unavailable; it never returns ENOKEY. So remove checks for ENOKEY.Link: https://lore.kernel.org/r/20191209212348.243331-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers
13 Dec, 2019
3 commits
-
Otherwise, it can cause circular locking dependency reported by mm.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
We need to use GFP_NOFS, since we did f2fs_lock_op().
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim -
This patch avoids some unnecessary locks for quota files when write_begin
fails.Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
11 Dec, 2019
1 commit
-
Otherwise, we can hit deadlock by waiting for the locked page in
move_data_block in GC.Thread A Thread B
- do_page_mkwrite
- f2fs_vm_page_mkwrite
- lock_page
- f2fs_balance_fs
- mutex_lock(gc_mutex)
- f2fs_gc
- do_garbage_collect
- ra_data_block
- grab_cache_page
- f2fs_balance_fs
- mutex_lock(gc_mutex)Fixes: 39a8695824510 ("f2fs: refactor ->page_mkwrite() flow")
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
10 Dec, 2019
1 commit
-
The previous preallocation and DIO decision like below.
allow_outplace_dio !allow_outplace_dio
f2fs_force_buffered_io (*) No_Prealloc / Buffered_IO Prealloc / Buffered_IO
!f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIOBut, Javier reported Case (*) where zoned device bypassed preallocation but
fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
being read.In order to fix the issue, actually we need to preallocate blocks whenever
we fall back to buffered IO like this. No change is made in the other cases.allow_outplace_dio !allow_outplace_dio
f2fs_force_buffered_io (*) Prealloc / Buffered_IO Prealloc / Buffered_IO
!f2fs_force_buffered_io No_Prealloc / DIO Prealloc / DIOReported-and-tested-by: Javier Gonzalez
Signed-off-by: Damien Le Moal
Tested-by: Shin'ichiro Kawasaki
Reviewed-by: Chao Yu
Reviewed-by: Javier González
Signed-off-by: Jaegeuk Kim
09 Dec, 2019
1 commit
-
Push clamping timestamps into notify_change(), so in-kernel
callers like nfsd and overlayfs will get similar timestamp
set behavior as utimes.AV: get rid of clamping in ->setattr() instances; we don't need
to bother with that there, with notify_change() doing normalization
in all cases now (it already did for implicit case, since current_time()
clamps).Suggested-by: Miklos Szeredi
Fixes: 42e729b9ddbb ("utimes: Clamp the timestamps before update")
Cc: stable@vger.kernel.org # v5.4
Cc: Deepa Dinamani
Cc: Jeff Layton
Signed-off-by: Amir Goldstein
Signed-off-by: Al Viro
02 Dec, 2019
1 commit
-
Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
"As part of the cleanup of some remaining y2038 issues, I came to
fs/compat_ioctl.c, which still has a couple of commands that need
support for time64_t.In completely unrelated work, I spent time on cleaning up parts of
this file in the past, moving things out into drivers instead.After Al Viro reviewed an earlier version of this series and did a lot
more of that cleanup, I decided to try to completely eliminate the
rest of it and move it all into drivers.This series incorporates some of Al's work and many patches of my own,
but in the end stops short of actually removing the last part, which
is the scsi ioctl handlers. I have patches for those as well, but they
need more testing or possibly a rewrite"* tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
scsi: sd: enable compat ioctls for sed-opal
pktcdvd: add compat_ioctl handler
compat_ioctl: move SG_GET_REQUEST_TABLE handling
compat_ioctl: ppp: move simple commands into ppp_generic.c
compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
compat_ioctl: unify copy-in of ppp filters
tty: handle compat PPP ioctls
compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
compat_ioctl: handle SIOCOUTQNSD
af_unix: add compat_ioctl support
compat_ioctl: reimplement SG_IO handling
compat_ioctl: move WDIOC handling into wdt drivers
fs: compat_ioctl: move FITRIM emulation into file systems
gfs2: add compat_ioctl support
compat_ioctl: remove unused convert_in_user macro
compat_ioctl: remove last RAID handling code
compat_ioctl: remove /dev/raw ioctl translation
compat_ioctl: remove PCI ioctl translation
compat_ioctl: remove joystick ioctl translation
...