05 Mar, 2020

1 commit

  • commit 3e5e479a39ce9ed60cd63f7565cc1d9da77c2a4e upstream.

    As Youling reported in mailing list:

    https://www.linuxquestions.org/questions/linux-newbie-8/the-file-system-f2fs-is-broken-4175666043/

    https://www.linux.org/threads/the-file-system-f2fs-is-broken.26490/

    There is a test case that can corrupt an f2fs image:
    - dd if=/dev/zero of=/swapfile bs=1M count=4096
    - chmod 600 /swapfile
    - mkswap /swapfile
    - swapon --discard /swapfile

    The root cause is that f2fs_swap_activate() intends to return zero to
    setup_swap_extents() to enable SWP_FS mode (the swap file goes through the
    fs); in this flow, setup_swap_extents() sets up swap extents with a wrong
    block address range, resulting in discard_swap() erasing an incorrect
    address range.

    Because f2fs_swap_activate() has pinned the swapfile, its data block
    addresses will not change, so it's safe to let swap handle IO through the
    raw device. We can therefore get rid of SWP_FS mode and initialize the swap
    extents inside f2fs_swap_activate(); this way, the later discard_swap() can
    trim the right address range.
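    A minimal sketch of that pattern with made-up helpers (example_map_contiguous()
    stands in for the filesystem's block-mapping logic; add_swap_extent() and the
    ->swap_activate() signature are the real swap-layer interfaces): registering
    the extents itself and returning a positive count keeps setup_swap_extents()
    from falling back to SWP_FS, so swap IO and discard hit the raw device blocks.

    #include <linux/fs.h>
    #include <linux/mm.h>
    #include <linux/swap.h>

    /* Hypothetical helper: map the contiguous file range starting at page
     * offset 'pgoff' to on-disk block 'pblock', returning its length in
     * pages via 'len'. */
    static int example_map_contiguous(struct inode *inode, pgoff_t pgoff,
                                      sector_t *pblock, unsigned long *len);

    static int example_swap_activate(struct swap_info_struct *sis,
                                     struct file *file, sector_t *span)
    {
        struct inode *inode = file_inode(file);
        pgoff_t nr_pages = i_size_read(inode) >> PAGE_SHIFT;
        pgoff_t pgoff = 0;
        int nr_extents = 0;

        while (pgoff < nr_pages) {
            sector_t pblock;
            unsigned long len;
            int ret;

            ret = example_map_contiguous(inode, pgoff, &pblock, &len);
            if (ret)
                return ret;

            /* Register the extent directly with the swap layer. */
            ret = add_swap_extent(sis, pgoff, len, pblock);
            if (ret < 0)
                return ret;
            nr_extents += ret;
            pgoff += len;
        }

        *span = nr_pages;
        return nr_extents;      /* > 0: extents are set up, no SWP_FS fallback */
    }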

    Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     

24 Feb, 2020

5 commits

  • [ Upstream commit fe396ad8e7526f059f7b8c7290d33a1b84adacab ]

    If kobject_init_and_add() fails, the caller needs to invoke kobject_put()
    to release the kobject explicitly.
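    A minimal sketch of the pattern this fix enforces (illustrative only, not the
    f2fs change itself; struct example_obj and example_ktype are made-up): once
    kobject_init_and_add() has run, the kobject holds a live reference, so the
    error path must drop it with kobject_put() rather than freeing the object
    directly.

    #include <linux/kobject.h>
    #include <linux/slab.h>

    struct example_obj {
        struct kobject kobj;
        /* ... */
    };

    static void example_release(struct kobject *kobj)
    {
        kfree(container_of(kobj, struct example_obj, kobj));
    }

    static struct kobj_type example_ktype = {
        .release = example_release,
    };

    static int example_register(struct example_obj *obj, struct kobject *parent)
    {
        int err;

        err = kobject_init_and_add(&obj->kobj, &example_ktype, parent, "example");
        if (err) {
            /* The refcount is live even on failure: release it explicitly. */
            kobject_put(&obj->kobj);
            return err;
        }
        return 0;
    }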

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 820d366736c949ffe698d3b3fe1266a91da1766d ]

    Detected kmemleak.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit 5b1dbb082f196278f82b6a15a13848efacb9ff11 ]

    This patch moves setting I_LINKABLE early in rename2(whiteout) to avoid the
    below warning.

    [ 3189.163385] WARNING: CPU: 3 PID: 59523 at fs/inode.c:358 inc_nlink+0x32/0x40
    [ 3189.246979] Call Trace:
    [ 3189.248707] f2fs_init_inode_metadata+0x2d6/0x440 [f2fs]
    [ 3189.251399] f2fs_add_inline_entry+0x162/0x8c0 [f2fs]
    [ 3189.254010] f2fs_add_dentry+0x69/0xe0 [f2fs]
    [ 3189.256353] f2fs_do_add_link+0xc5/0x100 [f2fs]
    [ 3189.258774] f2fs_rename2+0xabf/0x1010 [f2fs]
    [ 3189.261079] vfs_rename+0x3f8/0xaa0
    [ 3189.263056] ? tomoyo_path_rename+0x44/0x60
    [ 3189.265283] ? do_renameat2+0x49b/0x550
    [ 3189.267324] do_renameat2+0x49b/0x550
    [ 3189.269316] __x64_sys_renameat2+0x20/0x30
    [ 3189.271441] do_syscall_64+0x5a/0x230
    [ 3189.273410] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 3189.275848] RIP: 0033:0x7f270b4d9a49

    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit bdf03299248916640a835a05d32841bb3d31912d ]

    Otherwise, we can hit deadlock by waiting for the locked page in
    move_data_block in GC.

    Thread A                              Thread B
    - do_page_mkwrite
     - f2fs_vm_page_mkwrite
      - lock_page
                                          - f2fs_balance_fs
                                           - mutex_lock(gc_mutex)
                                            - f2fs_gc
                                             - do_garbage_collect
                                              - ra_data_block
                                               - grab_cache_page
       - f2fs_balance_fs
        - mutex_lock(gc_mutex)
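    A sketch of the reordering that avoids the deadlock above (illustrative, not
    the actual f2fs diff; example_balance_fs() is a made-up stand-in for the
    balance/GC entry point): enter the gc_mutex path before taking the page
    lock, so GC never waits on a page this thread holds.

    #include <linux/fs.h>
    #include <linux/mm.h>
    #include <linux/pagemap.h>

    static void example_balance_fs(struct super_block *sb);    /* hypothetical */

    static vm_fault_t example_page_mkwrite(struct vm_fault *vmf)
    {
        struct page *page = vmf->page;
        struct inode *inode = file_inode(vmf->vma->vm_file);

        /* May block on gc_mutex; must not hold the page lock yet. */
        example_balance_fs(inode->i_sb);

        lock_page(page);
        /* ... verify the mapping, reserve blocks, dirty the page ... */
        return VM_FAULT_LOCKED;     /* page is returned locked */
    }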

    Fixes: 39a8695824510 ("f2fs: refactor ->page_mkwrite() flow")
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit 47501f87c61ad2aa234add63e1ae231521dbc3f5 ]

    The previous preallocation and DIO decision was as below:

                                  allow_outplace_dio           !allow_outplace_dio
    f2fs_force_buffered_io    (*) No_Prealloc / Buffered_IO    Prealloc / Buffered_IO
    !f2fs_force_buffered_io       No_Prealloc / DIO            Prealloc / DIO

    But Javier reported Case (*), where a zoned device bypassed preallocation but
    fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
    being read.

    In order to fix the issue, we actually need to preallocate blocks whenever
    we fall back to buffered IO, as below (a code restatement follows the
    table). No change is made in the other cases.

                                  allow_outplace_dio           !allow_outplace_dio
    f2fs_force_buffered_io    (*) Prealloc / Buffered_IO       Prealloc / Buffered_IO
    !f2fs_force_buffered_io       No_Prealloc / DIO            Prealloc / DIO
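    A compact restatement of the fixed decision table as code, with hypothetical
    predicate names mirroring the row and column labels (this is not the actual
    f2fs logic, just the table above expressed as a function):

    #include <linux/fs.h>

    static bool example_force_buffered_io(struct inode *inode);    /* hypothetical */
    static bool example_allow_outplace_dio(struct inode *inode);   /* hypothetical */

    static bool example_should_preallocate(struct kiocb *iocb, struct inode *inode)
    {
        bool dio = iocb->ki_flags & IOCB_DIRECT;

        if (!dio)
            return true;                        /* plain buffered write */
        if (example_force_buffered_io(inode))
            return true;                        /* row (*): DIO falls back to buffered */
        if (example_allow_outplace_dio(inode))
            return false;                       /* No_Prealloc / DIO */
        return true;                            /* Prealloc / DIO */
    }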

    Reported-and-tested-by: Javier Gonzalez
    Signed-off-by: Damien Le Moal
    Tested-by: Shin'ichiro Kawasaki
    Reviewed-by: Chao Yu
    Reviewed-by: Javier González
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     

11 Feb, 2020

6 commits

  • commit 80f2388afa6ef985f9c5c228e36705c4d4db4756 upstream.

    Since ->d_compare() and ->d_hash() can be called in RCU-walk mode,
    ->d_parent and ->d_inode can be concurrently modified, and in
    particular, ->d_inode may be changed to NULL. For f2fs_d_hash() this
    resulted in a reproducible NULL dereference if a lookup is done in a
    directory being deleted, e.g. with:

    int main()
    {
        if (fork()) {
            for (;;) {
                mkdir("subdir", 0700);
                rmdir("subdir");
            }
        } else {
            for (;;)
                access("subdir/file", 0);
        }
    }

    ... or by running the 't_encrypted_d_revalidate' program from xfstests.
    Both repros work in any directory on a filesystem with the encoding
    feature, even if the directory doesn't actually have the casefold flag.

    I couldn't reproduce a crash in f2fs_d_compare(), but it appears that a
    similar crash is possible there.

    Fix these bugs by reading ->d_parent and ->d_inode using READ_ONCE() and
    falling back to the case sensitive behavior if the inode is NULL.
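    An illustrative sketch of that access pattern (not the exact f2fs code;
    example_ci_compare() is a made-up stand-in for the case-insensitive
    comparison): sample ->d_parent and ->d_inode once with READ_ONCE() and fall
    back to an exact comparison when the inode has become NULL.

    #include <linux/dcache.h>
    #include <linux/fs.h>
    #include <linux/string.h>

    static int example_ci_compare(const struct inode *dir, const struct qstr *name,
                                  const char *str, unsigned int len); /* hypothetical */

    static int example_d_compare(const struct dentry *dentry, unsigned int len,
                                 const char *str, const struct qstr *name)
    {
        const struct dentry *parent = READ_ONCE(dentry->d_parent);
        const struct inode *dir = READ_ONCE(parent->d_inode);

        /* Directory already gone, or not casefolded: exact match only. */
        if (!dir || !IS_CASEFOLDED(dir)) {
            if (len != name->len)
                return 1;
            return memcmp(str, name->name, len);
        }

        return example_ci_compare(dir, name, str, len);
    }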

    Reported-by: Al Viro
    Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups")
    Cc: # v5.4+
    Signed-off-by: Eric Biggers
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 5515eae647426169e4b7969271fb207881eba7f6 upstream.

    Do the name comparison for non-casefolded directories correctly.

    This is analogous to ext4's commit 66883da1eee8 ("ext4: fix dcache
    lookup of !casefolded directories").

    Fixes: 2c2eb7a300cd ("f2fs: Support case-insensitive file name lookups")
    Cc: # v5.4+
    Signed-off-by: Eric Biggers
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit bf2cbd3c57159c2b639ee8797b52ab5af180bf83 upstream.

    Call min_not_zero() to simplify the complicated prjquota
    limit comparison in f2fs_statfs_project().
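    For reference, min_not_zero() returns the smaller of two values while
    treating zero as "no limit"; a sketch of the kind of simplification this
    enables, using the generic quota fields only as an assumed illustration:

    #include <linux/kernel.h>
    #include <linux/quota.h>

    /* Before: open-coded "smaller non-zero of soft and hard limit". */
    static u64 limit_open_coded(const struct mem_dqblk *db)
    {
        u64 limit = 0;

        if (db->dqb_bsoftlimit)
            limit = db->dqb_bsoftlimit;
        if (db->dqb_bhardlimit &&
            (!limit || db->dqb_bhardlimit < limit))
            limit = db->dqb_bhardlimit;
        return limit;
    }

    /* After: a single expression. */
    static u64 limit_simplified(const struct mem_dqblk *db)
    {
        return min_not_zero(db->dqb_bsoftlimit, db->dqb_bhardlimit);
    }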

    Signed-off-by: Chengguang Xu
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chengguang Xu
     
  • commit acdf2172172a511f97fa21ed0ee7609a6d3b3a07 upstream.

    statfs calculates the Total/Used/Avail disk space in block units, so we
    should translate the soft/hard prjquota limits to block units as well.

    The testing result below shows that the block/inode numbers of
    Total/Used/Avail from the df command are all correct after applying this
    patch.

    [root@localhost quota-tools]# ./repquota -P /dev/sdb1

    Chengguang Xu
     
  • commit 909110c060f22e65756659ec6fa957ae75777e00 upstream.

    Setting a softlimit larger than the hardlimit seems meaningless for disk
    quota, but currently it is allowed. In this case, there may be a bit of
    confusion for users when they run the df command on a directory which has
    project quota.

    For example, we set a 20M softlimit and a 10M hardlimit for the block usage
    limit of the project quota of test_dir (project id 123).

    [root@hades f2fs]# repquota -P -a

    Chengguang Xu
     
  • commit eb31e2f63d85d1bec4f7b136f317e03c03db5503 upstream.

    Push clamping timestamps into notify_change(), so in-kernel
    callers like nfsd and overlayfs will get similar timestamp
    set behavior as utimes.

    AV: get rid of clamping in ->setattr() instances; we don't need
    to bother with that there, with notify_change() doing normalization
    in all cases now (it already did for implicit case, since current_time()
    clamps).

    Suggested-by: Miklos Szeredi
    Fixes: 42e729b9ddbb ("utimes: Clamp the timestamps before update")
    Cc: stable@vger.kernel.org # v5.4
    Cc: Deepa Dinamani
    Cc: Jeff Layton
    Signed-off-by: Amir Goldstein
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Amir Goldstein
     

18 Jan, 2020

1 commit

  • commit 1f0d5c911b64165c9754139a26c8c2fad352c132 upstream.

    We expect a 64-bit calculation result from the statement below; however, on
    a 32-bit machine, the left shift operation on a pgoff_t type variable may
    overflow. Fix it by forcing a type cast.

    page->index << PAGE_SHIFT;
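    A small userspace illustration of the overflow (assuming 4 KiB pages and a
    32-bit unsigned int standing in for a 32-bit pgoff_t): without widening
    before the shift, byte offsets past 4 GiB wrap around; the fix casts to a
    64-bit type before shifting.

    #include <stdio.h>

    int main(void)
    {
        unsigned int page_shift = 12;       /* 4 KiB pages */
        unsigned int index = 0x200000;      /* page index at the 8 GiB mark */

        /* 32-bit shift: the result wraps to 0. */
        unsigned int wrong = index << page_shift;

        /* Widen first, as the fix does with a cast to a 64-bit type. */
        unsigned long long right = (unsigned long long)index << page_shift;

        printf("wrong: 0x%x  right: 0x%llx\n", wrong, right);
        return 0;
    }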

    Fixes: 26de9b117130 ("f2fs: avoid unnecessary updating inode during fsync")
    Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     

05 Jan, 2020

3 commits

  • [ Upstream commit 677017d196ba2a4cfff13626b951cc9a206b8c7c ]

    The FS got stuck in the stack below when the storage is in an almost
    full/dirty condition (while FG_GC is being done).

    schedule_timeout
    io_schedule_timeout
    congestion_wait
    f2fs_drop_inmem_pages_all
    f2fs_gc
    f2fs_balance_fs
    __write_node_page
    f2fs_fsync_node_pages
    f2fs_do_sync_file
    f2fs_ioctl

    The root cause for this issue is a potential infinite loop in
    f2fs_drop_inmem_pages_all() for the case where gc_failure is true and there
    is an inode whose i_gc_failures[GC_FAILURE_ATOMIC] is not set. Fix this by
    keeping track of the total number of atomic files currently opened and
    using that to exit from this condition.

    Fix-suggested-by: Chao Yu
    Signed-off-by: Chao Yu
    Signed-off-by: Sahitya Tummala
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Sahitya Tummala
     
  • [ Upstream commit 2a60637f06ac94869b2e630eaf837110d39bf291 ]

    As Eric reported:

    RENAME_EXCHANGE support was just added to fsstress in xfstests:

    commit 65dfd40a97b6bbbd2a22538977bab355c5bc0f06
    Author: kaixuxia
    Date: Thu Oct 31 14:41:48 2019 +0800

    fsstress: add EXCHANGE renameat2 support

    This is causing xfstest generic/579 to fail due to fsck.f2fs reporting errors.
    I'm not sure what the problem is, but it still happens even with all the
    fs-verity stuff in the test commented out, so that the test just runs fsstress.

    generic/579 23s ... [10:02:25]
    [ 7.745370] run fstests generic/579 at 2019-11-04 10:02:25
    _check_generic_filesystem: filesystem on /dev/vdc is inconsistent
    (see /results/f2fs/results-default/generic/579.full for details)
    [10:02:47]
    Ran: generic/579
    Failures: generic/579
    Failed 1 of 1 tests
    Xunit report: /results/f2fs/results-default/result.xml

    Here's the contents of 579.full:

    _check_generic_filesystem: filesystem on /dev/vdc is inconsistent
    *** fsck.f2fs output ***
    [ASSERT] (__chk_dots_dentries:1378) --> Bad inode number[0x24] for '..', parent parent ino is [0xd10]

    The root cause is that we forgot to update the directory's i_pino during
    cross_rename. Fix it.

    Fixes: 32f9bc25cbda0 ("f2fs: support ->rename2()")
    Signed-off-by: Chao Yu
    Tested-by: Eric Biggers
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit fe1897eaa6646f5a64a4cee0e6473ed9887d324b ]

    generic/018 reports an inconsistent atime status; the test case is as
    below:
    - open file with O_SYNC
    - write file to construct fraged space
    - calc md5 of file
    - record {a,c,m}time
    - defrag file --- do nothing
    - umount & mount
    - check {a,c,m}time

    The root cause is that, as f2fs enables lazytime by default, an atime update
    only dirties the VFS inode rather than the f2fs inode (by setting
    FI_DIRTY_INODE), so f2fs_write_inode(), called later from the VFS, will fail
    to update the inode page because of our skip:

    f2fs_write_inode()
        if (is_inode_flag_set(inode, FI_DIRTY_INODE))
            return 0;

    So eventually, after evict(), we lose the last atime forever.

    To fix this issue, we need to check whether {a,c,m,cr}time is consistent
    between the inode cache and the inode page, and only skip
    f2fs_update_inode() if the f2fs inode is not dirty and the times are
    consistent as well.
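    An illustrative sketch of that consistency check (struct example_disk_inode
    and example_update_inode_page() are hypothetical; is_inode_flag_set() and
    FI_DIRTY_INODE come from fs/f2fs/f2fs.h, as in the snippet above): only
    skip the update when the inode is clean and the cached times still match
    the on-disk copy.

    #include <linux/fs.h>

    /* Hypothetical on-disk inode layout, used only for this sketch. */
    struct example_disk_inode {
        __le64 i_atime;
        __le64 i_ctime;
        __le64 i_mtime;
    };

    static int example_update_inode_page(struct inode *inode);     /* hypothetical */

    static bool example_times_consistent(struct inode *inode,
                                         struct example_disk_inode *ri)
    {
        return inode->i_atime.tv_sec == le64_to_cpu(ri->i_atime) &&
               inode->i_ctime.tv_sec == le64_to_cpu(ri->i_ctime) &&
               inode->i_mtime.tv_sec == le64_to_cpu(ri->i_mtime);
    }

    static int example_write_inode(struct inode *inode, struct example_disk_inode *ri)
    {
        if (is_inode_flag_set(inode, FI_DIRTY_INODE) ||
            !example_times_consistent(inode, ri))
            return example_update_inode_page(inode);

        return 0;       /* clean and the times match: safe to skip */
    }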

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     

22 Sep, 2019

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "In this round, we introduced casefolding support in f2fs, and fixed
    various bugs in individual features such as IO alignment,
    checkpoint=disable, quota, and swapfile.

    Enhancement:
    - support casefolding w/ enhancement in ext4
    - support fiemap for directory
    - support FS_IO_GET|SET_FSLABEL

    Bug fix:
    - fix IO stuck during checkpoint=disable
    - avoid infinite GC loop
    - fix panic/overflow related to IO alignment feature
    - fix livelock in swap file
    - fix discard command leak
    - disallow dio for atomic_write"

    * tag 'f2fs-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
    f2fs: add a condition to detect overflow in f2fs_ioc_gc_range()
    f2fs: fix to add missing F2FS_IO_ALIGNED() condition
    f2fs: fix to fallback to buffered IO in IO aligned mode
    f2fs: fix to handle error path correctly in f2fs_map_blocks
    f2fs: fix extent corrupotion during directIO in LFS mode
    f2fs: check all the data segments against all node ones
    f2fs: Add a small clarification to CONFIG_FS_F2FS_FS_SECURITY
    f2fs: fix inode rwsem regression
    f2fs: fix to avoid accessing uninitialized field of inode page in is_alive()
    f2fs: avoid infinite GC loop due to stale atomic files
    f2fs: Fix indefinite loop in f2fs_gc()
    f2fs: convert inline_data in prior to i_size_write
    f2fs: fix error path of f2fs_convert_inline_page()
    f2fs: add missing documents of reserve_root/resuid/resgid
    f2fs: fix flushing node pages when checkpoint is disabled
    f2fs: enhance f2fs_is_checkpoint_ready()'s readability
    f2fs: clean up __bio_alloc()'s parameter
    f2fs: fix wrong error injection path in inc_valid_block_count()
    f2fs: fix to writeout dirty inode during node flush
    f2fs: optimize case-insensitive lookups
    ...

    Linus Torvalds
     

20 Sep, 2019

1 commit

  • Pull y2038 vfs updates from Arnd Bergmann:
    "Add inode timestamp clamping.

    This series from Deepa Dinamani adds a per-superblock minimum/maximum
    timestamp limit for a file system, and clamps timestamps as they are
    written, to avoid random behavior from integer overflow as well as
    having different time stamps on disk vs in memory.

    At mount time, a warning is now printed for any file system that can
    represent current timestamps but not future timestamps more than 30
    years into the future, similar to the arbitrary 30 year limit that was
    added to settimeofday().

    This was picked as a compromise to warn users to migrate to other file
    systems (e.g. ext4 instead of ext3) when they need the file system to
    survive beyond 2038 (or similar limits in other file systems), but not
    get in the way of normal usage"

    * tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
    ext4: Reduce ext4 timestamp warnings
    isofs: Initialize filesystem timestamp ranges
    pstore: fs superblock limits
    fs: omfs: Initialize filesystem timestamp ranges
    fs: hpfs: Initialize filesystem timestamp ranges
    fs: ceph: Initialize filesystem timestamp ranges
    fs: sysv: Initialize filesystem timestamp ranges
    fs: affs: Initialize filesystem timestamp ranges
    fs: fat: Initialize filesystem timestamp ranges
    fs: cifs: Initialize filesystem timestamp ranges
    fs: nfs: Initialize filesystem timestamp ranges
    ext4: Initialize timestamps limits
    9p: Fill min and max timestamps in sb
    fs: Fill in max and min timestamps in superblock
    utimes: Clamp the timestamps before update
    mount: Add mount warning for impending timestamp expiry
    timestamp_truncate: Replace users of timespec64_trunc
    vfs: Add timestamp_truncate() api
    vfs: Add file timestamp range support

    Linus Torvalds
     

19 Sep, 2019

1 commit

  • Pull fs-verity support from Eric Biggers:
    "fs-verity is a filesystem feature that provides Merkle tree based
    hashing (similar to dm-verity) for individual readonly files, mainly
    for the purpose of efficient authenticity verification.

    This pull request includes:

    (a) The fs/verity/ support layer and documentation.

    (b) fs-verity support for ext4 and f2fs.

    Compared to the original fs-verity patchset from last year, the UAPI
    to enable fs-verity on a file has been greatly simplified. Lots of
    other things were cleaned up too.

    fs-verity is planned to be used by two different projects on Android;
    most of the userspace code is in place already. Another userspace tool
    ("fsverity-utils"), and xfstests, are also available. e2fsprogs and
    f2fs-tools already have fs-verity support. Other people have shown
    interest in using fs-verity too.

    I've tested this on ext4 and f2fs with xfstests, both the existing
    tests and the new fs-verity tests. This has also been in linux-next
    since July 30 with no reported issues except a couple minor ones I
    found myself and folded in fixes for.

    Ted and I will be co-maintaining fs-verity"

    * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
    f2fs: add fs-verity support
    ext4: update on-disk format documentation for fs-verity
    ext4: add fs-verity read support
    ext4: add basic fs-verity support
    fs-verity: support builtin file signatures
    fs-verity: add SHA-512 support
    fs-verity: implement FS_IOC_MEASURE_VERITY ioctl
    fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
    fs-verity: add data verification hooks for ->readpages()
    fs-verity: add the hook for file ->setattr()
    fs-verity: add the hook for file ->open()
    fs-verity: add inode and superblock fields
    fs-verity: add Kconfig and the helper functions for hashing
    fs: uapi: define verity bit for FS_IOC_GETFLAGS
    fs-verity: add UAPI header
    fs-verity: add MAINTAINERS file entry
    fs-verity: add a documentation file

    Linus Torvalds
     

18 Sep, 2019

1 commit


16 Sep, 2019

9 commits

    In f2fs_allocate_data_block(), reset fio.retry for the IO alignment feature
    instead of the IO serialization feature.

    In addition, use F2FS_IO_ALIGNED() throughout to check the IO alignment
    feature status explicitly.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    In LFS mode, we allow OPU for direct IO; however, we didn't consider the IO
    alignment feature, so direct IO can trigger unaligned IO. Let's just fall
    back to buffered IO to keep correct IO alignment semantics in all places.

    Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • In f2fs_map_blocks(), we should bail out once __allocate_data_block()
    failed.

    Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • In LFS mode, por_fsstress testcase reports a bug as below:

    [ASSERT] (fsck_chk_inode_blk: 931) --> ino: 0x12fe has wrong ext: [pgofs:142, blk:215424, len:16]

    Since commit f847c699cff3 ("f2fs: allow out-place-update for direct
    IO in LFS mode"), we have allowed OPU mode for direct IO; however, we
    missed updating the extent cache in __allocate_data_block(), which leaves
    the extent field inconsistent with the physical block address. Fix it.

    Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    As a part of the sanity checking while mounting, it is verified that data
    and node segments are assigned distinct segment numbers. Fix a small bug in
    this verification: we need to check all the data segments against all the
    node segments, as sketched below.
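    A sketch of the strengthened cross-check (the function and parameter names
    here are made up; the checkpoint stores the current segment numbers as
    little-endian values): every current data segment number is compared
    against every current node segment number.

    #include <linux/types.h>
    #include <linux/errno.h>
    #include <asm/byteorder.h>

    static int example_check_cur_segments(const __le32 *cur_data_segno,
                                          const __le32 *cur_node_segno,
                                          int nr_data, int nr_node)
    {
        int i, j;

        for (i = 0; i < nr_data; i++)
            for (j = 0; j < nr_node; j++)
                if (le32_to_cpu(cur_data_segno[i]) ==
                    le32_to_cpu(cur_node_segno[j]))
                    return -EINVAL;     /* the same segment is used twice */

        return 0;
    }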

    Fixes: 042be0f849e5f ("f2fs: fix to do sanity check with current segment number")
    Signed-off-by: Surbhi Palande
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Surbhi Palande
     
  • Signed-off-by: Lockywolf
    Signed-off-by: Jaegeuk Kim

    Lockywolf
     
    This is similar to 942491c9e6d6 ("xfs: fix AIM7 regression"). Apparently
    our current rwsem code doesn't like the trylock-then-lock-for-real scheme,
    so change our read/write methods to just do the trylock for the RWF_NOWAIT
    case.

    We don't need a check for IOCB_NOWAIT and !direct-IO because it
    is checked in generic_write_checks().
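    A sketch of the locking pattern described above (illustrative only;
    example_do_write() is a made-up stand-in for the actual write path): for
    IOCB_NOWAIT, only try the lock and bail out with -EAGAIN, instead of a
    trylock followed by a blocking lock.

    #include <linux/fs.h>
    #include <linux/uio.h>

    static ssize_t example_do_write(struct kiocb *iocb, struct iov_iter *from); /* hypothetical */

    static ssize_t example_write_iter(struct kiocb *iocb, struct iov_iter *from)
    {
        struct inode *inode = file_inode(iocb->ki_filp);
        ssize_t ret;

        if (iocb->ki_flags & IOCB_NOWAIT) {
            if (!inode_trylock(inode))
                return -EAGAIN;         /* caller asked not to block */
        } else {
            inode_lock(inode);
        }

        ret = example_do_write(iocb, from);
        inode_unlock(inode);
        return ret;
    }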

    Fixes: b91050a80cec ("f2fs: add nowait aio support")
    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Goldwyn Rodrigues
     
    If an inode is newly created, its inode page may not yet be synchronized
    with the inode cache, so fields like .i_inline or .i_extra_isize could be
    wrong. In the call path below, we may access such wrong fields, resulting
    in failure to migrate a valid target block.

    Thread A                              Thread B
    - f2fs_create
     - f2fs_add_link
      - f2fs_add_dentry
       - f2fs_init_inode_metadata
        - f2fs_add_inline_entry
         - f2fs_new_inode_page
          - f2fs_put_page
          : inode page wasn't updated with inode cache
                                          - gc_data_segment
                                           - is_alive
                                            - f2fs_get_node_page
                                            - datablock_addr
                                            - offset_in_addr
                                            : access uninitialized fields

    Fixes: 7a2af766af15 ("f2fs: enhance on-disk inode structure scalability")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    If committing atomic pages fails in f2fs_do_sync_file(), we can end up with
    committed pages but atomic_file still set, like:

    - inmem: 0, atomic IO: 4 (Max. 10), volatile IO: 0 (Max. 0)

    If GC selects this block, we can get an infinite loop like this:

    f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
    f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
    f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
    f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c
    f2fs_submit_page_bio: dev = (253,7), ino = 2, page_index = 0x2359a8, oldaddr = 0x2359a8, newaddr = 0x2359a8, rw = READ(), type = COLD_DATA
    f2fs_submit_read_bio: dev = (253,7)/(253,7), rw = READ(), DATA, sector = 18533696, size = 4096
    f2fs_get_victim: dev = (253,7), type = No TYPE, policy = (Foreground GC, LFS-mode, Greedy), victim = 4355, cost = 1, ofs_unit = 1, pre_victim_secno = 4355, prefree = 0, free = 234
    f2fs_iget: dev = (253,7), ino = 6247, pino = 5845, i_mode = 0x81b0, i_size = 319488, i_nlink = 1, i_blocks = 624, i_advise = 0x2c

    In that moment, we can observe:

    [Before]
    Try to move 5084219 blocks (BG: 384508)
    - data blocks : 4962373 (274483)
    - node blocks : 121846 (110025)
    Skipped : atomic write 4534686 (10)

    [After]
    Try to move 5088973 blocks (BG: 384508)
    - data blocks : 4967127 (274483)
    - node blocks : 121846 (110025)
    Skipped : atomic write 4539440 (10)

    So, refactor the atomic_write flow like this:
    1. start_atomic_write
       - add to inmem_list and set atomic_file

    2. write()
       - register it in inmem_pages

    3. commit_atomic_write
       - if no error, f2fs_drop_inmem_pages()
       - f2fs_commit_inmem_pages() failed
         : __revoke_inmem_pages() was done
       - f2fs_do_sync_file failed
         : abort_atomic_write later

    4. abort_atomic_write
       - f2fs_drop_inmem_pages

    5. f2fs_drop_inmem_pages
       - clear atomic_file
       - remove from inmem_list

    Based on this change, when GC fails to move block in atomic_file,
    f2fs_drop_inmem_pages_all() can call f2fs_drop_inmem_pages().

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

09 Sep, 2019

1 commit

  • Policy - foreground GC, LFS mode and greedy GC mode.

    Under this policy, f2fs_gc() loops forever as it doesn't have enough free
    segments to proceed and thus keeps calling gc_more for the same victim
    segment. This can happen if the selected victim segment could not be GC'd
    due to a failed blkaddr validity check, i.e. is_alive() returns false for
    the blocks set in the current validity map.

    Fix this by not resetting sbi->cur_victim_sec to NULL_SEGNO when the
    selected segment could not be GC'd. This helps to select another segment
    for GC and thus to proceed forward with GC.

    [Note]
    This can happen due to is_alive as well as atomic_file, which skips GC.

    Signed-off-by: Sahitya Tummala
    Signed-off-by: Jaegeuk Kim

    Sahitya Tummala
     

07 Sep, 2019

8 commits

    In the call path below, we change i_size before inline conversion; however,
    if we fail to convert the inline inode, the inode may end up with a wrong
    i_size that is larger than the max inline size, resulting in inline inode
    corruption.

    - f2fs_setattr
    - truncate_setsize
    - f2fs_convert_inline_inode

    This patch reorders truncate_setsize() and f2fs_convert_inline_inode()
    to guarantee inline_data has valid i_size.
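    A sketch of the ordering (the helpers and the inline-size limit are
    hypothetical; truncate_setsize() is the real VFS helper): convert away from
    inline data first, and only then publish the larger i_size.

    #include <linux/fs.h>
    #include <linux/mm.h>

    #define EXAMPLE_MAX_INLINE_SIZE 3400    /* arbitrary placeholder, not the real f2fs limit */

    static bool example_has_inline_data(struct inode *inode);       /* hypothetical */
    static int example_convert_inline_inode(struct inode *inode);   /* hypothetical */

    static int example_setattr_size(struct inode *inode, loff_t newsize)
    {
        int err;

        /* Convert first, so an inline inode never carries an oversized i_size. */
        if (example_has_inline_data(inode) && newsize > EXAMPLE_MAX_INLINE_SIZE) {
            err = example_convert_inline_inode(inode);
            if (err)
                return err;
        }

        truncate_setsize(inode, newsize);
        return 0;
    }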

    Fixes: 0cab80ee0c9e ("f2fs: fix to convert inline inode in ->setattr")
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    In the error path of f2fs_convert_inline_page(), we missed truncating the
    newly reserved block in .i_addrs[0] once get_node_info() failed. Fix it.

    Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch fixes skipping node page writes when checkpoint is disabled.
    In this period, we can't rely on checkpoint to flush node pages.

    Fixes: fd8c8caf7e7c ("f2fs: let checkpoint flush dnode page of regular")
    Fixes: 4354994f097d ("f2fs: checkpoint disabling")
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    This patch changes the semantics of f2fs_is_checkpoint_ready()'s return
    value: return true when the checkpoint is ready, otherwise return false.
    This improves the readability of the conditions below.

    f2fs_submit_page_write()
        ...
        if (is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN) ||
            !f2fs_is_checkpoint_ready(sbi))
                __submit_merged_bio(io);

    f2fs_balance_fs()
        ...
        if (!f2fs_is_checkpoint_ready(sbi))
                return;

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Just cleanup, no logic change.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    If FAULT_BLOCK type error injection is on, we may incorrectly decrease the
    sbi->alloc_valid_block_count percpu stat count in inc_valid_block_count().
    Fix it.

    Fixes: 36b877af7992 ("f2fs: Keep alloc_valid_block_count in sync")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • As Eric reported:

    On xfstest generic/204 on f2fs, I'm getting a kernel BUG.

    allocate_segment_by_default+0x9d/0x100 [f2fs]
    f2fs_allocate_data_block+0x3c0/0x5c0 [f2fs]
    do_write_page+0x62/0x110 [f2fs]
    f2fs_do_write_node_page+0x2b/0xa0 [f2fs]
    __write_node_page+0x2ec/0x590 [f2fs]
    f2fs_sync_node_pages+0x756/0x7e0 [f2fs]
    block_operations+0x25b/0x350 [f2fs]
    f2fs_write_checkpoint+0x104/0x1150 [f2fs]
    f2fs_sync_fs+0xa2/0x120 [f2fs]
    f2fs_balance_fs_bg+0x33c/0x390 [f2fs]
    f2fs_write_node_pages+0x4c/0x1f0 [f2fs]
    do_writepages+0x1c/0x70
    __writeback_single_inode+0x45/0x320
    writeback_sb_inodes+0x273/0x5c0
    wb_writeback+0xff/0x2e0
    wb_workfn+0xa1/0x370
    process_one_work+0x138/0x350
    worker_thread+0x4d/0x3d0
    kthread+0x109/0x140

    The root cause of this issue is that in a very small partition, e.g. in the
    generic/204 test case of the fstests suite, the filesystem's free space is
    50MB, so at most we can write 12800 inline inodes with the command
    `echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i`; the filesystem
    will then have:
    - 12800 dirty inline data pages
    - 12800 dirty inode pages
    - and 12800 dirty imeta (dirty inodes)

    When we flush the node inode's page cache, we can also flush inline data
    with each inode page; however, this runs out of free space on the device,
    and once a checkpoint is triggered there is no room for the huge number of
    imeta. At this point, GC is useless, as there is no dirty segment at all.

    In order to fix this, we try to recognize inode pages during the node
    inode's page flushing and update the inode page from the dirty inode, so
    that another imeta (dirty inode) flush can be avoided later.

    Reported-and-tested-by: Eric Biggers
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    This patch ports the casefold enhancement patch below from ext4 to f2fs:

    commit 3ae72562ad91 ("ext4: optimize case-insensitive lookups")

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

30 Aug, 2019

1 commit

  • Update the inode timestamp updates to use timestamp_truncate()
    instead of timespec64_trunc().

    The change was mostly generated by the following coccinelle
    script.

    virtual context
    virtual patch

    @r1 depends on patch forall@
    struct inode *inode;
    identifier i_xtime =~ "^i_[acm]time$";
    expression e;
    @@

    inode->i_xtime =
    - timespec64_trunc(
    + timestamp_truncate(
    ...,
    - e);
    + inode);

    Signed-off-by: Deepa Dinamani
    Acked-by: Greg Kroah-Hartman
    Acked-by: Jeff Layton
    Cc: adrian.hunter@intel.com
    Cc: dedekind1@gmail.com
    Cc: gregkh@linuxfoundation.org
    Cc: hch@lst.de
    Cc: jaegeuk@kernel.org
    Cc: jlbec@evilplan.org
    Cc: richard@nod.at
    Cc: tj@kernel.org
    Cc: yuchao0@huawei.com
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: linux-mtd@lists.infradead.org

    Deepa Dinamani
     

23 Aug, 2019

1 commit