Eric Lee / smarc-fsl-linux-kernel

06 Jan, 2021

1 commit

1c5a03471 f2fs: fix shift-out-of-bounds in sanity_check_raw_super() ... Browse Code »

commit e584bbe821229a3e7cc409eecd51df66f9268c21 upstream.

syzbot reported a bug which could cause shift-out-of-bounds issue,
fix it.

Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
read_raw_super_block fs/f2fs/super.c:3267 [inline]
f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
mount_bdev+0x34d/0x410 fs/super.c:1366
legacy_get_tree+0x105/0x220 fs/fs_context.c:592
vfs_get_tree+0x89/0x2f0 fs/super.c:1496
do_new_mount fs/namespace.c:2896 [inline]
path_mount+0x12ae/0x1e70 fs/namespace.c:3227
do_mount fs/namespace.c:3240 [inline]
__do_sys_mount fs/namespace.c:3448 [inline]
__se_sys_mount fs/namespace.c:3425 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
Signed-off-by: Greg Kroah-Hartman

Chao Yu
2021-01-06 21:56:52 +0800

30 Dec, 2020

1 commit

9c14fb58a f2fs: fix double free of unicode map ... Browse Code »

[ Upstream commit 89ff6005039a878afac87889fee748fa3f957c3a ]

In case of retrying fill_super with skip_recovery,
s_encoding for casefold would not be loaded again even though it's
already been freed because it's not NULL.
Set NULL after free to prevent double freeing when unmount.

Fixes: eca4873ee1b6 ("f2fs: Use generic casefolding support")
Signed-off-by: Hyeongseok Kim
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
Signed-off-by: Sasha Levin

Hyeongseok Kim
2020-12-30 18:53:31 +0800

25 Oct, 2020

1 commit

0eac1102e Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull misc vfs updates from Al Viro:
"Assorted stuff all over the place (the largest group here is
Christoph's stat cleanups)"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: remove KSTAT_QUERY_FLAGS
fs: remove vfs_stat_set_lookup_flags
fs: move vfs_fstatat out of line
fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat
fs: remove vfs_statx_fd
fs: omfs: use kmemdup() rather than kmalloc+memcpy
[PATCH] reduce boilerplate in fsid handling
fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS
selftests: mount: add nosymfollow tests
Add a "nosymfollow" mount option.

Linus Torvalds
2020-10-25 03:26:05 +0800

17 Oct, 2020

1 commit

7a3dadedc Merge tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added new features such as zone capacity for ZNS
and a new GC policy, ATGC, along with in-memory segment management. In
addition, we could improve the decompression speed significantly by
changing virtual mapping method. Even though we've fixed lots of small
bugs in compression support, I feel that it becomes more stable so
that I could give it a try in production.

Enhancements:
- suport zone capacity in NVMe Zoned Namespace devices
- introduce in-memory current segment management
- add standart casefolding support
- support age threshold based garbage collection
- improve decompression speed by changing virtual mapping method

Bug fixes:
- fix condition checks in some ioctl() such as compression, move_range, etc
- fix 32/64bits support in data structures
- fix memory allocation in zstd decompress
- add some boundary checks to avoid kernel panic on corrupted image
- fix disallowing compression for non-empty file
- fix slab leakage of compressed block writes

In addition, it includes code refactoring for better readability and
minor bug fixes for compression and zoned device support"

* tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
f2fs: code cleanup by removing unnecessary check
f2fs: wait for sysfs kobject removal before freeing f2fs_sb_info
f2fs: fix writecount false positive in releasing compress blocks
f2fs: introduce check_swap_activate_fast()
f2fs: don't issue flush in f2fs_flush_device_cache() for nobarrier case
f2fs: handle errors of f2fs_get_meta_page_nofail
f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
f2fs: reject CASEFOLD inode flag without casefold feature
f2fs: fix memory alignment to support 32bit
f2fs: fix slab leak of rpages pointer
f2fs: compress: fix to disallow enabling compress on non-empty file
f2fs: compress: introduce cic/dic slab cache
f2fs: compress: introduce page array slab cache
f2fs: fix to do sanity check on segment/section count
f2fs: fix to check segment boundary during SIT page readahead
f2fs: fix uninit-value in f2fs_lookup
f2fs: remove unneeded parameter in find_in_block()
f2fs: fix wrong total_sections check and fsmeta check
f2fs: remove duplicated code in sanity_check_area_boundary
f2fs: remove unused check on version_bitmap
...

Linus Torvalds
2020-10-17 06:14:43 +0800

30 Sep, 2020

2 commits

c68d6c883 f2fs: compress: introduce cic/dic slab cache ... Browse Code »

Add two slab caches: "f2fs_cic_entry" and "f2fs_dic_entry" for memory
allocation of compress_io_ctx and decompress_io_ctx structure.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-30 00:16:36 +0800
310830317 f2fs: compress: introduce page array slab cache ... Browse Code »

Add a per-sbi slab cache "f2fs_page_array_entry-%u:%u" for memory
allocation of page pointer array in compress context.

Signed-off-by: Chao Yu
[Jaegeuk Kim: Fix wrong memory allocation]
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-30 00:16:32 +0800

29 Sep, 2020

5 commits

3a22e9ac7 f2fs: fix to do sanity check on segment/section count ... Browse Code »

As syzbot reported:

BUG: KASAN: slab-out-of-bounds in init_min_max_mtime fs/f2fs/segment.c:4710 [inline]
BUG: KASAN: slab-out-of-bounds in f2fs_build_segment_manager+0x9302/0xa6d0 fs/f2fs/segment.c:4792
Read of size 8 at addr ffff8880a1b934a8 by task syz-executor682/6878

CPU: 1 PID: 6878 Comm: syz-executor682 Not tainted 5.9.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
init_min_max_mtime fs/f2fs/segment.c:4710 [inline]
f2fs_build_segment_manager+0x9302/0xa6d0 fs/f2fs/segment.c:4792
f2fs_fill_super+0x381a/0x6e80 fs/f2fs/super.c:3633
mount_bdev+0x32e/0x3f0 fs/super.c:1417
legacy_get_tree+0x105/0x220 fs/fs_context.c:592
vfs_get_tree+0x89/0x2f0 fs/super.c:1547
do_new_mount fs/namespace.c:2875 [inline]
path_mount+0x1387/0x20a0 fs/namespace.c:3192
do_mount fs/namespace.c:3205 [inline]
__do_sys_mount fs/namespace.c:3413 [inline]
__se_sys_mount fs/namespace.c:3390 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3390
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The root cause is: if segs_per_sec is larger than one, and segment count
in last section is less than segs_per_sec, we will suffer out-of-boundary
memory access on sit_i->sentries[] in init_min_max_mtime().

Fix this by adding sanity check among segment count, section count and
segs_per_sec value in sanity_check_raw_super().

Reported-by: syzbot+481a3ffab50fed41dcc0@syzkaller.appspotmail.com
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-29 17:12:41 +0800
f99ba9add f2fs: fix wrong total_sections check and fsmeta check ... Browse Code »

Meta area is not included in section_count computation.
So the minimum number of total_sections is 1 meanwhile it cannot be
greater than segment_count_main.

The minimum number of meta segments is 8 (SB + 2 (CP + SIT + NAT) + SSA).

Signed-off-by: Wang Xiaojun
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Wang Xiaojun
2020-09-29 16:48:34 +0800
d89f58913 f2fs: remove duplicated code in sanity_check_area_boundary ... Browse Code »

Use seg_end_blkaddr instead of "segment0_blkaddr + (segment_count <<
log_blocks_per_seg)".

Signed-off-by: Wang Xiaojun
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Wang Xiaojun
2020-09-29 16:48:34 +0800
d0660122d f2fs: relocate blkzoned feature check ... Browse Code »

Relocate blkzoned feature check into parse_options() like
other feature check.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-29 16:48:33 +0800
07eb1d699 f2fs: do sanity check on zoned block device path ... Browse Code »

sbi->devs would be initialized only if image enables multiple device
feature or blkzoned feature, if blkzoned feature flag was set by fuzz
in non-blkzoned device, we will suffer below panic:

get_zone_idx fs/f2fs/segment.c:4892 [inline]
f2fs_usable_zone_blks_in_seg fs/f2fs/segment.c:4943 [inline]
f2fs_usable_blks_in_seg+0x39b/0xa00 fs/f2fs/segment.c:4999
Call Trace:
check_block_count+0x69/0x4e0 fs/f2fs/segment.h:704
build_sit_entries fs/f2fs/segment.c:4403 [inline]
f2fs_build_segment_manager+0x51da/0xa370 fs/f2fs/segment.c:5100
f2fs_fill_super+0x3880/0x6ff0 fs/f2fs/super.c:3684
mount_bdev+0x32e/0x3f0 fs/super.c:1417
legacy_get_tree+0x105/0x220 fs/fs_context.c:592
vfs_get_tree+0x89/0x2f0 fs/super.c:1547
do_new_mount fs/namespace.c:2896 [inline]
path_mount+0x12ae/0x1e70 fs/namespace.c:3216
do_mount fs/namespace.c:3229 [inline]
__do_sys_mount fs/namespace.c:3437 [inline]
__se_sys_mount fs/namespace.c:3414 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3414
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46

Add sanity check to inconsistency on factors: blkzoned flag, device
path and device character to avoid above panic.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-29 16:48:33 +0800

22 Sep, 2020

2 commits

c8c868abc fscrypt: make fscrypt_set_test_dummy_encryption() take a 'const char *' ... Browse Code »

fscrypt_set_test_dummy_encryption() requires that the optional argument
to the test_dummy_encryption mount option be specified as a substring_t.
That doesn't work well with filesystems that use the new mount API,
since the new way of parsing mount options doesn't use substring_t.

Make it take the argument as a 'const char *' instead.

Instead of moving the match_strdup() into the callers in ext4 and f2fs,
make them just use arg->from directly. Since the pattern is
"test_dummy_encryption=%s", the argument will be null-terminated.

Acked-by: Jeff Layton
Link: https://lore.kernel.org/r/20200917041136.178600-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers

Eric Biggers
2020-09-22 21:48:52 +0800
ac4acb1f4 fscrypt: handle test_dummy_encryption in more logical way ... Browse Code »

The behavior of the test_dummy_encryption mount option is that when a
new file (or directory or symlink) is created in an unencrypted
directory, it's automatically encrypted using a dummy encryption policy.
That's it; in particular, the encryption (or lack thereof) of existing
files (or directories or symlinks) doesn't change.

Unfortunately the implementation of test_dummy_encryption is a bit weird
and confusing. When test_dummy_encryption is enabled and a file is
being created in an unencrypted directory, we set up an encryption key
(->i_crypt_info) for the directory. This isn't actually used to do any
encryption, however, since the directory is still unencrypted! Instead,
->i_crypt_info is only used for inheriting the encryption policy.

One consequence of this is that the filesystem ends up providing a
"dummy context" (policy + nonce) instead of a "dummy policy". In
commit ed318a6cc0b6 ("fscrypt: support test_dummy_encryption=v2"), I
mistakenly thought this was required. However, actually the nonce only
ends up being used to derive a key that is never used.

Another consequence of this implementation is that it allows for
'inode->i_crypt_info != NULL && !IS_ENCRYPTED(inode)', which is an edge
case that can be forgotten about. For example, currently
FS_IOC_GET_ENCRYPTION_POLICY on an unencrypted directory may return the
dummy encryption policy when the filesystem is mounted with
test_dummy_encryption. That seems like the wrong thing to do, since
again, the directory itself is not actually encrypted.

Therefore, switch to a more logical and maintainable implementation
where the dummy encryption policy inheritance is done without setting up
keys for unencrypted directories. This involves:

- Adding a function fscrypt_policy_to_inherit() which returns the
encryption policy to inherit from a directory. This can be a real
policy, a dummy policy, or no policy.

- Replacing struct fscrypt_dummy_context, ->get_dummy_context(), etc.
with struct fscrypt_dummy_policy, ->get_dummy_policy(), etc.

- Making fscrypt_fname_encrypted_size() take an fscrypt_policy instead
of an inode.

Acked-by: Jaegeuk Kim
Acked-by: Jeff Layton
Link: https://lore.kernel.org/r/20200917041136.178600-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers

Eric Biggers
2020-09-22 21:48:49 +0800

19 Sep, 2020

1 commit

6d1349c76 [PATCH] reduce boilerplate in fsid handling ... Browse Code »

Get rid of boilerplate in most of ->statfs()
instances...

Signed-off-by: Al Viro

Al Viro
2020-09-19 04:45:50 +0800

12 Sep, 2020

3 commits

c2759ebaf f2fs: change i_compr_blocks of inode to atomic value ... Browse Code »

writepages() can be concurrently invoked for the same file by different
threads such as a thread fsyncing the file and a kworker kernel thread.
So, changing i_compr_blocks without protection is racy and we need to
protect it by changing it with atomic type value. Plus, we don't need
a 64bit value for i_compr_blocks, so just we will use a atomic value,
not atomic64.

Signed-off-by: Daeho Jeong
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Daeho Jeong
2020-09-12 02:11:26 +0800
69c0dd29f f2fs: ignore compress mount option on image w/o compression feature ... Browse Code »

to keep consistent with behavior when passing compress mount option
to kernel w/o compression feature, so that mount may not fail on
such condition.

Reported-by: Kyungmin Park
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-12 02:11:25 +0800
093749e29 f2fs: support age threshold based garbage collection ... Browse Code »

There are several issues in current background GC algorithm:
- valid blocks is one of key factors during cost overhead calculation,
so if segment has less valid block, however even its age is young or
it locates hot segment, CB algorithm will still choose the segment as
victim, it's not appropriate.
- GCed data/node will go to existing logs, no matter in-there datas'
update frequency is the same or not, it may mix hot and cold data
again.
- GC alloctor mainly use LFS type segment, it will cost free segment
more quickly.

This patch introduces a new algorithm named age threshold based
garbage collection to solve above issues, there are three steps
mainly:

1. select a source victim:
- set an age threshold, and select candidates beased threshold:
e.g.
0 means youngest, 100 means oldest, if we set age threshold to 80
then select dirty segments which has age in range of [80, 100] as
candiddates;
- set candidate_ratio threshold, and select candidates based the
ratio, so that we can shrink candidates to those oldest segments;
- select target segment with fewest valid blocks in order to
migrate blocks with minimum cost;

2. select a target victim:
- select candidates beased age threshold;
- set candidate_radius threshold, search candidates whose age is
around source victims, searching radius should less than the
radius threshold.
- select target segment with most valid blocks in order to avoid
migrating current target segment.

3. merge valid blocks from source victim into target victim with
SSR alloctor.

Test steps:
- create 160 dirty segments:
* half of them have 128 valid blocks per segment
* left of them have 384 valid blocks per segment
- run background GC

Benefit: GC count and block movement count both decrease obviously:

- Before:
- Valid: 86
- Dirty: 1
- Prefree: 11
- Free: 6001 (6001)

GC calls: 162 (BG: 220)
- data segments : 160 (160)
- node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
- data blocks : 40960 (40960)
- node blocks : 494 (494)

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments

- After:

- Valid: 87
- Dirty: 0
- Prefree: 4
- Free: 6008 (6008)

GC calls: 75 (BG: 76)
- data segments : 74 (74)
- node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
- data blocks : 12544 (12544)
- node blocks : 269 (269)

IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments

Signed-off-by: Chao Yu
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-12 02:11:15 +0800

11 Sep, 2020

3 commits

eca4873ee f2fs: Use generic casefolding support ... Browse Code »

This switches f2fs over to the generic support provided in
the previous patch.

Since casefolded dentries behave the same in ext4 and f2fs, we decrease
the maintenance burden by unifying them, and any optimizations will
immediately apply to both.

Signed-off-by: Daniel Rosenberg
Reviewed-by: Eric Biggers
Signed-off-by: Jaegeuk Kim

Daniel Rosenberg
2020-09-11 05:03:31 +0800
d0b9e42ab f2fs: introduce inmem curseg ... Browse Code »

Previous implementation of aligned pinfile allocation will:
- allocate new segment on cold data log no matter whether last used
segment is partially used or not, it makes IOs more random;
- force concurrent cold data/GCed IO going into warm data area, it
can make a bad effect on hot/cold data separation;

In this patch, we introduce a new type of log named 'inmem curseg',
the differents from normal curseg is:
- it reuses existed segment type (CURSEG_XXX_NODE/DATA);
- it only exists in memory, its segno, blkofs, summary will not b
persisted into checkpoint area;

With this new feature, we can enhance scalability of log, special
allocators can be created for purposes:
- pure lfs allocator for aligned pinfile allocation or file
defragmentation
- pure ssr allocator for later feature

So that, let's update aligned pinfile allocation to use this new
inmem curseg fwk.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-09-11 05:03:30 +0800
de881df97 f2fs: support zone capacity less than zone size ... Browse Code »

NVMe Zoned Namespace devices can have zone-capacity less than zone-size.
Zone-capacity indicates the maximum number of sectors that are usable in
a zone beginning from the first sector of the zone. This makes the sectors
sectors after the zone-capacity till zone-size to be unusable.
This patch set tracks zone-size and zone-capacity in zoned devices and
calculate the usable blocks per segment and usable segments per section.

If zone-capacity is less than zone-size mark only those segments which
start before zone-capacity as free segments. All segments at and beyond
zone-capacity are treated as permanently used segments. In cases where
zone-capacity does not align with segment size the last segment will start
before zone-capacity and end beyond the zone-capacity of the zone. For
such spanning segments only sectors within the zone-capacity are used.

During writes and GC manage the usable segments in a section and usable
blocks per segment. Segments which are beyond zone-capacity are never
allocated, and do not need to be garbage collected, only the segments
which are before zone-capacity needs to garbage collected.
For spanning segments based on the number of usable blocks in that
segment, write to blocks only up to zone-capacity.

Zone-capacity is device specific and cannot be configured by the user.
Since NVMe ZNS device zones are sequentially write only, a block device
with conventional zones or any normal block device is needed along with
the ZNS device for the metadata operations of F2fs.

A typical nvme-cli output of a zoned device shows zone start and capacity
and write pointer as below:

SLBA: 0x0 WP: 0x0 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x20000 WP: 0x20000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ
SLBA: 0x40000 WP: 0x40000 Cap: 0x18800 State: EMPTY Type: SEQWRITE_REQ

Here zone size is 64MB, capacity is 49MB, WP is at zone start as the zones
are in EMPTY state. For each zone, only zone start + 49MB is usable area,
any lba/sector after 49MB cannot be read or written to, the drive will fail
any attempts to read/write. So, the second zone starts at 64MB and is
usable till 113MB (64 + 49) and the range between 113 and 128MB is
again unusable. The next zone starts at 128MB, and so on.

Signed-off-by: Aravind Ramesh
Signed-off-by: Damien Le Moal
Signed-off-by: Niklas Cassel
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Aravind Ramesh
2020-09-11 05:03:29 +0800

11 Aug, 2020

1 commit

086ba2ec1 Merge tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added two small interfaces: (a) GC_URGENT_LOW
mode for performance and (b) F2FS_IOC_SEC_TRIM_FILE ioctl for
security.

The new GC mode allows Android to run some lower priority GCs in
background, while new ioctl discards user information without race
condition when the account is removed.

In addition, some patches were merged to address latency-related
issues. We've fixed some compression-related bug fixes as well as edge
race conditions.

Enhancements:
- add GC_URGENT_LOW mode in gc_urgent
- introduce F2FS_IOC_SEC_TRIM_FILE ioctl
- bypass racy readahead to improve read latencies
- shrink node_write lock coverage to avoid long latency

Bug fixes:
- fix missing compression flag control, i_size, and mount option
- fix deadlock between quota writes and checkpoint
- remove inode eviction path in synchronous path to avoid deadlock
- fix to wait GCed compressed page writeback
- fix a kernel panic in f2fs_is_compressed_page
- check page dirty status before writeback
- wait page writeback before update in node page write flow
- fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry

We've added some minor sanity checks and refactored trivial code
blocks for better readability and debugging information"

* tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
f2fs: prepare a waiter before entering io_schedule
f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
f2fs: replace test_and_set/clear_bit() with set/clear_bit()
f2fs: make file immutable even if releasing zero compression block
f2fs: compress: disable compression mount option if compression is off
f2fs: compress: add sanity check during compressed cluster read
f2fs: use macro instead of f2fs verity version
f2fs: fix deadlock between quota writes and checkpoint
f2fs: correct comment of f2fs_exist_written_data
f2fs: compress: delay temp page allocation
f2fs: compress: fix to update isize when overwriting compressed file
f2fs: space related cleanup
f2fs: fix use-after-free issue
f2fs: Change the type of f2fs_flush_inline_data() to void
f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl
f2fs: should avoid inode eviction in synchronous path
f2fs: segment.h: delete a duplicated word
f2fs: compress: fix to avoid memory leak on cc->cpages
f2fs: use generic names for generic ioctls
f2fs: don't keep meta inode pages used for compressed block migration
...

Linus Torvalds
2020-08-11 09:33:22 +0800

04 Aug, 2020

1 commit

1f0b067b6 f2fs: compress: disable compression mount option if compression is off ... Browse Code »

If CONFIG_F2FS_FS_COMPRESSION is off, don't allow to configure or
show compression related mount option.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-08-04 01:32:52 +0800

24 Jul, 2020

1 commit

99c787cfd f2fs: fix use-after-free issue ... Browse Code »

During umount, f2fs_put_super() unregisters procfs entries after
f2fs_destroy_segment_manager(), it may cause use-after-free
issue when umount races with procfs accessing, fix it by relocating
f2fs_unregister_sysfs().

[Chao Yu: change commit title/message a bit]

Signed-off-by: Li Guifu
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Li Guifu
2020-07-24 11:22:42 +0800

09 Jul, 2020

1 commit

27aacd28e f2fs: add inline encryption support ... Browse Code »

Wire up f2fs to support inline encryption via the helper functions which
fs/crypto/ now provides. This includes:

- Adding a mount option 'inlinecrypt' which enables inline encryption
on encrypted files where it can be used.

- Setting the bio_crypt_ctx on bios that will be submitted to an
inline-encrypted file.

- Not adding logically discontiguous data to bios that will be submitted
to an inline-encrypted file.

- Not doing filesystem-layer crypto on inline-encrypted files.

This patch includes a fix for a race during IPU by
Sahitya Tummala

Signed-off-by: Satya Tangirala
Acked-by: Jaegeuk Kim
Reviewed-by: Eric Biggers
Reviewed-by: Chao Yu
Link: https://lore.kernel.org/r/20200702015607.1215430-4-satyat@google.com
Co-developed-by: Eric Biggers
Signed-off-by: Eric Biggers

Satya Tangirala
2020-07-09 01:29:43 +0800

08 Jul, 2020

1 commit

6b12367da f2fs: avoid readahead race condition ... Browse Code »

If two readahead threads having same offset enter in readpages, every read
IOs are split and issued to the disk which giving lower bandwidth.

This patch tries to avoid redundant readahead calls.

Fixes one build error reported by Randy.
Fix build error when F2FS_FS_COMPRESSION is not set/enabled.
This label is needed in either case.

../fs/f2fs/data.c: In function ‘f2fs_mpage_readpages’:
../fs/f2fs/data.c:2327:5: error: label ‘next_page’ used but not defined
goto next_page;

Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2020-07-08 12:51:48 +0800

19 Jun, 2020

2 commits

ba87a45c2 f2fs: use kfree() to free variables allocated by match_strdup() ... Browse Code »

Use kfree() instead of kvfree() to free variables allocated
by match_strdup(). Because the memory is allocated with kmalloc
inside match_strdup().

Signed-off-by: Wang Xiaojun
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Wang Xiaojun
2020-06-19 03:37:47 +0800
742532d11 f2fs: use kfree() instead of kvfree() to free superblock data ... Browse Code »

Use kfree() instead of kvfree() to free super in read_raw_super_block()
because the memory is allocated with kzalloc() in the function.
Use kfree() instead of kvfree() to free sbi, raw_super in
f2fs_fill_super() and f2fs_put_super() because the memory is allocated
with kzalloc().

Signed-off-by: Denis Efremov
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Denis Efremov
2020-06-19 03:33:18 +0800

10 Jun, 2020

1 commit

42612e776 Merge tag 'f2fs-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added some knobs to enhance compression feature
and harden testing environment. In addition, we've fixed several bugs
reported from Android devices such as long discarding latency, device
hanging during quota_sync, etc.

Enhancements:
- support lzo-rle algorithm
- add two ioctls to release and reserve blocks for compression
- support partial truncation/fiemap on compressed file
- introduce sysfs entries to attach IO flags explicitly
- add iostat trace point along with read io stat

Bug fixes:
- fix long discard latency
- flush quota data by f2fs_quota_sync correctly
- fix to recover parent inode number for power-cut recovery
- fix lz4/zstd output buffer budget
- parse checkpoint mount option correctly
- avoid inifinite loop to wait for flushing node/meta pages
- manage discard space correctly

And some refactoring and clean up patches were added"

* tag 'f2fs-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
f2fs: attach IO flags to the missing cases
f2fs: add node_io_flag for bio flags likewise data_io_flag
f2fs: remove unused parameter of f2fs_put_rpages_mapping()
f2fs: handle readonly filesystem in f2fs_ioc_shutdown()
f2fs: avoid utf8_strncasecmp() with unstable name
f2fs: don't return vmalloc() memory from f2fs_kmalloc()
f2fs: fix retry logic in f2fs_write_cache_pages()
f2fs: fix wrong discard space
f2fs: compress: don't compress any datas after cp stop
f2fs: remove unneeded return value of __insert_discard_tree()
f2fs: fix wrong value of tracepoint parameter
f2fs: protect new segment allocation in expand_inode_data
f2fs: code cleanup by removing ifdef macro surrounding
f2fs: avoid inifinite loop to wait for flushing node pages at cp_error
f2fs: flush dirty meta pages when flushing them
f2fs: fix checkpoint=disable:%u%%
f2fs: compress: fix zstd data corruption
f2fs: add compressed/gc data read IO stat
f2fs: fix potential use-after-free issue
f2fs: compress: don't handle non-compressed data in workqueue
...

Linus Torvalds
2020-06-10 02:28:59 +0800

09 Jun, 2020

1 commit

0b6d4ca04 f2fs: don't return vmalloc() memory from f2fs_kmalloc() ... Browse Code »

kmalloc() returns kmalloc'ed memory, and kvmalloc() returns either
kmalloc'ed or vmalloc'ed memory. But the f2fs wrappers, f2fs_kmalloc()
and f2fs_kvmalloc(), both return both kinds of memory.

It's redundant to have two functions that do the same thing, and also
breaking the standard naming convention is causing bugs since people
assume it's safe to kfree() memory allocated by f2fs_kmalloc(). See
e.g. the various allocations in fs/f2fs/compress.c.

Fix this by making f2fs_kmalloc() just use kmalloc(). And to avoid
re-introducing the allocation failures that the vmalloc fallback was
intended to fix, convert the largest allocations to use f2fs_kvmalloc().

Signed-off-by: Eric Biggers
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Eric Biggers
2020-06-09 11:34:58 +0800

19 May, 2020

2 commits

ed318a6cc fscrypt: support test_dummy_encryption=v2 ... Browse Code »

v1 encryption policies are deprecated in favor of v2, and some new
features (e.g. encryption+casefolding) are only being added for v2.

Therefore, the "test_dummy_encryption" mount option (which is used for
encryption I/O testing with xfstests) needs to support v2 policies.

To do this, extend its syntax to be "test_dummy_encryption=v1" or
"test_dummy_encryption=v2". The existing "test_dummy_encryption" (no
argument) also continues to be accepted, to specify the default setting
-- currently v1, but the next patch changes it to v2.

To cleanly support both v1 and v2 while also making it easy to support
specifying other encryption settings in the future (say, accepting
"$contents_mode:$filenames_mode:v2"), make ext4 and f2fs maintain a
pointer to the dummy fscrypt_context rather than using mount flags.

To avoid concurrency issues, don't allow test_dummy_encryption to be set
or changed during a remount. (The former restriction is new, but
xfstests doesn't run into it, so no one should notice.)

Tested with 'gce-xfstests -c {ext4,f2fs}/encrypt -g auto'. On ext4,
there are two regressions, both of which are test bugs: ext4/023 and
ext4/028 fail because they set an xattr and expect it to be stored
inline, but the increase in size of the fscrypt_context from
24 to 40 bytes causes this xattr to be spilled into an external block.

Link: https://lore.kernel.org/r/20200512233251.118314-4-ebiggers@kernel.org
Acked-by: Jaegeuk Kim
Reviewed-by: Theodore Ts'o
Signed-off-by: Eric Biggers

Eric Biggers
2020-05-19 11:21:48 +0800
1ae18f71c f2fs: fix checkpoint=disable:%u%% ... Browse Code »

When parsing the mount option, we don't have sbi->user_block_count.
Should do it after getting it.

Cc:
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2020-05-19 01:47:24 +0800

12 May, 2020

4 commits

b4b10061e f2fs: refactor resize_fs to avoid meta updates in progress ... Browse Code »

Sahitya raised an issue:
- prevent meta updates while checkpoint is in progress

allocate_segment_for_resize() can cause metapage updates if
it requires to change the current node/data segments for resizing.
Stop these meta updates when there is a checkpoint already
in progress to prevent inconsistent CP data.

Signed-off-by: Sahitya Tummala
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2020-05-12 11:37:13 +0800
baaa7ebf2 f2fs: report delalloc reserve as non-free in statfs for project quota ... Browse Code »

This reserved space isn't committed yet but cannot be used for
allocations. For userspace it has no difference from used space.

See the same fix in ext4 commit f06925c73942 ("ext4: report delalloc
reserve as non-free in statfs for project quota").

Fixes: ddc34e328d06 ("f2fs: introduce f2fs_statfs_project")
Signed-off-by: Konstantin Khlebnikov
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Konstantin Khlebnikov
2020-05-12 11:36:47 +0800
6d92b2010 f2fs: compress: support lzo-rle compress algorithm ... Browse Code »

LZO-RLE extension (run length encoding) was introduced to improve
performance of LZO algorithm in scenario of data contains many zeros,
zram has changed to use this extended algorithm by default, this
patch adds to support this algorithm extension, to enable this
extension, it needs to enable F2FS_FS_LZO and F2FS_FS_LZORLE config,
and specifies "compress_algorithm=lzo-rle" mountoption.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-05-12 11:36:46 +0800
5e6bbde95 f2fs: introduce mempool for {,de}compress intermediate page allocation ... Browse Code »

If compression feature is on, in scenario of no enough free memory,
page refault ratio is higher than before, the root cause is:
- {,de}compression flow needs to allocate intermediate pages to store
compressed data in cluster, so during their allocation, vm may reclaim
mmaped pages.
- if above reclaimed pages belong to compressed cluster, during its
refault, it may cause more intermediate pages allocation, result in
reclaiming more mmaped pages.

So this patch introduces a mempool for intermediate page allocation,
in order to avoid high refault ratio, by default, number of
preallocated page in pool is 512, user can change the number by
assigning 'num_compress_pages' parameter during module initialization.

Ma Feng found warnings in the original patch and fixed like below.

Fix the following sparse warning:
fs/f2fs/compress.c:501:5: warning: symbol 'num_compress_pages' was not declared.
Should it be static?
fs/f2fs/compress.c:530:6: warning: symbol 'f2fs_compress_free_page' was not
declared. Should it be static?

Reported-by: Hulk Robot
Signed-off-by: Chao Yu
Signed-off-by: Ma Feng
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-05-12 11:35:51 +0800

08 May, 2020

1 commit

3c57f7518 f2fs: use strcmp() in parse_options() ... Browse Code »

Remove the pointless string length checks. Just use strcmp().

Signed-off-by: Eric Biggers
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Eric Biggers
2020-05-08 21:55:56 +0800

18 Apr, 2020

1 commit

2bc4bea33 f2fs: add tracepoint for f2fs iostat ... Browse Code »

Added a tracepoint to see iostat of f2fs. Default period of that
is 3 second. This tracepoint can be used to be monitoring
I/O statistics periodically.

Signed-off-by: Daeho Jeong
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Daeho Jeong
2020-04-18 00:16:57 +0800

08 Apr, 2020

1 commit

f40f31cad Merge tag 'f2fs-for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"In this round, we've mainly focused on fixing bugs and addressing
issues in recently introduced compression support.

Enhancement:
- add zstd support, and set LZ4 by default
- add ioctl() to show # of compressed blocks
- show mount time in debugfs
- replace rwsem with spinlock
- avoid lock contention in DIO reads

Some major bug fixes wrt compression:
- compressed block count
- memory access and leak
- remove obsolete fields
- flag controls

Other bug fixes and clean ups:
- fix overflow when handling .flags in inode_info
- fix SPO issue during resize FS flow
- fix compression with fsverity enabled
- potential deadlock when writing compressed pages
- show missing mount options"

* tag 'f2fs-for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits)
f2fs: keep inline_data when compression conversion
f2fs: fix to disable compression on directory
f2fs: add missing CONFIG_F2FS_FS_COMPRESSION
f2fs: switch discard_policy.timeout to bool type
f2fs: fix to verify tpage before releasing in f2fs_free_dic()
f2fs: show compression in statx
f2fs: clean up dic->tpages assignment
f2fs: compress: support zstd compress algorithm
f2fs: compress: add .{init,destroy}_decompress_ctx callback
f2fs: compress: fix to call missing destroy_compress_ctx()
f2fs: change default compression algorithm
f2fs: clean up {cic,dic}.ref handling
f2fs: fix to use f2fs_readpage_limit() in f2fs_read_multi_pages()
f2fs: xattr.h: Make stub helpers inline
f2fs: fix to avoid double unlock
f2fs: fix potential .flags overflow on 32bit architecture
f2fs: fix NULL pointer dereference in f2fs_verity_work()
f2fs: fix to clear PG_error if fsverity failed
f2fs: don't call fscrypt_get_encryption_info() explicitly in f2fs_tmpfile()
f2fs: don't trigger data flush in foreground operation
...

Linus Torvalds
2020-04-08 04:48:26 +0800

04 Apr, 2020

1 commit

50cfa66f0 f2fs: compress: support zstd compress algorithm ... Browse Code »

Add zstd compress algorithm support, use "compress_algorithm=zstd"
mountoption to enable it.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-04-04 01:21:10 +0800

31 Mar, 2020

1 commit

91faa5344 f2fs: change default compression algorithm ... Browse Code »

Use LZ4 as default compression algorithm, as compared to LZO, it shows
almost the same compression ratio and much better decompression speed.

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2020-03-31 11:46:26 +0800