12 Jan, 2016
1 commit
-
This patch adds the last time that the user requested filesystem operations.
This information is later used to detect whether the system is idle.
Signed-off-by: Jaegeuk Kim
31 Dec, 2015
1 commit
-
Sometimes we stay silent when an I/O error occurs in the lower-layer device,
so the user will not receive any error return value for some operations even
though they did not actually succeed. This should be avoided, so this patch
reports such errors to the user.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
05 Dec, 2015
1 commit
-
Use sbi->blocks_per_seg directly to avoid the unnecessary calculation of
1 << sbi->log_blocks_per_seg.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
14 Oct, 2015
2 commits
-
Once f2fs_gc is done, wait_ms is changed once more, so its tracepoint should
be located after that change.
Reported-by: He YunLei
Signed-off-by: Jaegeuk Kim
-
different competitors
Since we use different page caches (normally the inode's page cache for R/W
and the meta inode's page cache for GC) to cache the same physical block
belonging to an encrypted inode, writeback of these two page caches should
be exclusive. But currently we do not handle the writeback state well, so
there is a potential racing problem:

a)
kworker: f2fs_gc:
- f2fs_write_data_pages
- f2fs_write_data_page
- do_write_data_page
- write_data_page
- f2fs_submit_page_mbio
(page#1 in the inode's page cache was
queued in the f2fs bio cache and is ready
to be written to a new blkaddr)
- gc_data_segment
- move_encrypted_block
- pagecache_get_page
(page#2 in the meta inode's page cache
was cached with the invalid data of the
physical block located at the new
blkaddr)
- f2fs_submit_page_mbio
(page#1 was submitted; later, page#2
with invalid data will be submitted)

b)
f2fs_gc:
- gc_data_segment
- move_encrypted_block
- f2fs_submit_page_mbio
(page#1 in the meta inode's page cache was
queued in the f2fs bio cache and is ready
to be written to a new blkaddr)
user thread:
- f2fs_write_begin
- f2fs_submit_page_bio
(we submit the request to the block layer
to update page#2 in the inode's page cache
with the physical block located at the new
blkaddr, so here we may read garbage
data from the new blkaddr since GC hasn't
written back page#1 yet)

This patch fixes the above potential racing problem for encrypted inodes.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
13 Oct, 2015
2 commits
-
Now, we use ra_meta_pages to read as many contiguous physical blocks as
possible to improve the performance of subsequent reads. However,
ra_meta_pages uses a synchronous readahead approach, submitting the bio with
READ. Since READ has high priority, it cannot be used for preloading blocks
when it is not certain when these readahead pages will be used.

This patch supports asynchronous readahead in ra_meta_pages by tagging the
bio with the READA flag in order to allow preloading.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
For normal inodes, pages are allocated with __GFP_FS, which can cause
filesystem calls when reclaiming memory and can accordingly incur a deadlock
condition.

So, this patch addresses the problem by introducing
f2fs_grab_cache_page(.., bool for_write), which calls
grab_cache_page_write_begin() with AOP_FLAG_NOFS.
Signed-off-by: Jaegeuk Kim
10 Oct, 2015
7 commits
-
This patch introduces a tracepoint to monitor background gc behaviors.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch introduces background_gc=sync, enabling synchronous cleaning in
the background.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch drops batched gc triggered through the ioctl, since the user can
easily control gc by designing a loop around the ->ioctl.

We support synchronous gc by forcing FG_GC in f2fs_gc, so with it the user
can make sure that all blocks gc-ed in this round are persistent on the
device by the time the ioctl returns.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
When searching for a victim during gc, if there are no dirty segments in the
filesystem, we still take the time to search the whole dirty segment map.
This is unnecessary, so it is better to skip the search in that case.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
When doing gc, we search for a victim in the dirty map starting from the
position of the last victim. We reset the current search position once we
touch the end of the dirty map and then search the whole dirty map, so
sometimes we search the range [victim, last] twice. That is redundant;
this patch avoids the issue.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
If we do not call get_victim first, we cannot get a new victim for the retry
path.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch fixes garbage collection to maintain the right count of sections
freed when triggering a foreground gc.

Besides, when a foreground gc is running on the currently selected section,
once we fail to gc one segment it is better to abandon gc-ing the remaining
segments in that section: we will select the next victim for foreground gc
anyway, so gc on the remaining segments of the previous section becomes
overhead and also causes long latency for the caller.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
21 Aug, 2015
3 commits
-
If FG_GC failed to reclaim one section, let's retry with another section from
the start, since we can get another good candidate.
Signed-off-by: Jaegeuk Kim
-
If node blocks were already moved, we don't need to move them again.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
We should avoid needless checkpoints when there are no dirty or prefree
segments.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
05 Aug, 2015
2 commits
-
That encrypted page is used temporarily, so we don't need to mark it accessed.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This changes the function check_dnode to have a return type of bool, since
it only ever returns one or zero, and renames it to is_alive to better
describe its intended work of checking whether a dnode is still in use by
the filesystem.
Signed-off-by: Nicholas Krause
[Jaegeuk Kim: change the return value check for the renamed function]
Signed-off-by: Jaegeuk Kim
25 Jul, 2015
2 commits
-
The cgroup attaches inode->i_wb via mark_inode_dirty, and when
set_page_writeback is called, __inc_wb_stat() updates i_wb's stat. So, we
need to explicitly call set_page_dirty->__mark_inode_dirty prior to writing
back any pages.

This patch should resolve the following kernel panic reported by Andreas Reis.
https://bugzilla.kernel.org/show_bug.cgi?id=101801
--- Comment #2 from Andreas Reis ---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [] __percpu_counter_add+0x1a/0x90
PGD 2951ff067 PUD 2df43f067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1
Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS
T01 02/03/2015
task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
RIP: 0010:[] []
__percpu_counter_add+0x1a/0x90
RSP: 0018:ffff880295143ac8 EFLAGS: 00010082
RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
Call Trace:
[] __test_set_page_writeback+0xde/0x1d0
[] do_write_data_page+0xe7/0x3a0
[] gc_data_segment+0x5aa/0x640
[] do_garbage_collect+0x138/0x150
[] f2fs_gc+0x1be/0x3e0
[] f2fs_balance_fs+0x81/0x90
[] f2fs_unlink+0x47/0x1d0
[] vfs_unlink+0x109/0x1b0
[] do_unlinkat+0x287/0x2c0
[] SyS_unlink+0x16/0x20
[] entry_SYSCALL_64_fastpath+0x12/0x71
Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e 8b 47 20 48 63 ca
65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
RIP [] __percpu_counter_add+0x1a/0x90
RSP
CR2: 00000000000000a8
---[ end trace 5132449a58ed93a3 ]---
note: gcc[10356] exited with preempt_count 2

Signed-off-by: Jaegeuk Kim
-
This patch fixes some missing error handlers.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
02 Jun, 2015
1 commit
-
In f2fs_gc:                     In f2fs_replace_block:
- lock_page(sum_page)
- check_valid_map()             - mutex_lock(sentry_lock)
- mutex_lock(sentry_lock)       - change_curseg()
                                - lock_page(sum_page)

This patch fixes the deadlock condition.
Signed-off-by: Jaegeuk Kim
29 May, 2015
4 commits
-
This patch adds encryption support in the read and write paths.
Note that, in f2fs, we need to consider the cleaning operation: in the
cleaning procedure, we must avoid encrypting and decrypting written blocks.
So, this patch implements move_encrypted_block().
Signed-off-by: Jaegeuk Kim
-
This patch splits find_data_page as follows.
1. f2fs_gc
- use get_read_data_page() with read only
2. find_in_level
- use find_data_page without a locked page
3. truncate_partial_page
- in the cache_only mode, just drop the cached page
- otherwise, use get_lock_data_page() and guarantee truncation
Signed-off-by: Jaegeuk Kim
-
This patch moves getting victim page into move_data_page.
Signed-off-by: Jaegeuk Kim
-
This patch adds f2fs_sb_info and page pointers to the f2fs_io_info structure.
With this change, we can reduce a lot of parameters for IO functions.
Signed-off-by: Jaegeuk Kim
11 Apr, 2015
1 commit
-
This patch is for looking into gc performance of f2fs in detail.
Signed-off-by: Changman Lee
[Jaegeuk Kim: fix build errors]
Signed-off-by: Jaegeuk Kim
12 Feb, 2015
3 commits
-
This patch adds a FASTBOOT flag into the checkpoint as follows.
- CP_UMOUNT_FLAG is set when the system is unmounted.
- CP_FASTBOOT_FLAG is set when an intermediate checkpoint containing node
summaries was done.

So, if you get CP_UMOUNT_FLAG from the checkpoint, the system was unmounted
cleanly. Instead, if there was a sudden power-off, you will get
CP_FASTBOOT_FLAG or nothing.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Use the pointer parameter @wait to pass the result in {in,de}crease_sleep_time
for cleanup.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
No modification in functionality; this just cleans up the code with
f2fs_radix_tree_insert.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
10 Jan, 2015
1 commit
-
There are two slab caches, inode_entry_slab and winode_slab, using the same
structure, as below:

struct dir_inode_entry {
	struct list_head list;	/* list head */
	struct inode *inode;	/* vfs inode pointer */
};

struct inode_entry {
	struct list_head list;
	struct inode *inode;
};

It's a little wasteful that the two caches cannot share their memory space
with each other. So in this patch we remove the redundant winode_slab slab
cache, keep the more universal name struct inode_entry for the remaining
structure, and reuse inode_entry_slab to store both dirty-dir items and gc
items more efficiently.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
09 Dec, 2014
1 commit
-
This patch revisits retry paths in f2fs.
The basic idea is to use cond_resched instead of retrying from the very early
stage.
Suggested-by: Gu Zheng
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
06 Dec, 2014
1 commit
-
This patch tries to fix:
BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
(radix_tree_node_alloc+0x14/0x74) from [] (radix_tree_insert+0x110/0x200)
(radix_tree_insert+0x110/0x200) from [] (gc_data_segment+0x340/0x52c)
(gc_data_segment+0x340/0x52c) from [] (f2fs_gc+0x208/0x400)
(f2fs_gc+0x208/0x400) from [] (gc_thread_func+0x248/0x28c)
(gc_thread_func+0x248/0x28c) from [] (kthread+0xa0/0xac)
(kthread+0xa0/0xac) from [] (ret_from_fork+0x14/0x3c)

The reason is that f2fs calls radix_tree_insert with preemption enabled. So,
before calling it, we need to call radix_tree_preload. Otherwise, we would
have to use __GFP_WAIT for the radix tree and cover the radix tree operations
with a mutex or semaphore.
Signed-off-by: Jaegeuk Kim
03 Dec, 2014
1 commit
-
If there are many inodes that have data blocks in the victim segment, it
takes a long time to find an inode in the gc_inode list. Let's use a radix
tree to reduce the lookup time.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
28 Nov, 2014
1 commit
-
Little cleanup to distinguish each phase easily
Signed-off-by: Changman Lee
[Jaegeuk Kim: modify indentation for code readability]
Signed-off-by: Jaegeuk Kim
20 Nov, 2014
1 commit
-
In f2fs_remount, we stop the gc thread and set need_restart_gc to true when a
new option is set without BG_GC; then, if any error occurs in the following
procedure, we can restore by restarting the gc thread. But as it stands, we
fail to restore the gc thread in start_gc_thread because BG_GC is not set in
the new option, so we'd better move this condition check out of
start_gc_thread to fix the issue.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
05 Nov, 2014
1 commit
-
If a system wants to reduce booting time as a top priority, we can now use
the mount option -o fastboot. With this option, f2fs conducts a slightly
slower write_checkpoint, but it can avoid node page reads during the next
mount.
Signed-off-by: Jaegeuk Kim
04 Nov, 2014
1 commit
-
Remove the unneeded argument 'type' from __get_victim; use NO_CHECK_TYPE
directly when calling v_ops->get_victim().
Signed-off-by: Gu Zheng
Signed-off-by: Jaegeuk Kim
01 Oct, 2014
2 commits
-
This patch cleans up the existing and new macros for readability.
Rule is like this.
,-----------------------------------------> MAX_BLKADDR -,
| ,------------- TOTAL_BLKS ----------------------------,
| | |
| ,- seg0_blkaddr ,----- sit/nat/ssa/main blkaddress |
block | | (SEG0_BLKADDR) | | | | (e.g., MAIN_BLKADDR) |
address 0..x................ a b c d .............................
| |
global seg# 0...................... m .............................
| | |
| `------- MAIN_SEGS -----------'
`-------------- TOTAL_SEGS ---------------------------'
| |
seg# 0..........xx..................

= Note =
o GET_SEGNO_FROM_SEG0 : blk address -> global segno
o GET_SEGNO : blk address -> segno
o START_BLOCK : segno -> starting block address
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch adds a new data structure to control checkpoint parameters.
Currently, it presents the reason for the checkpoint, such as is_umount or
normal sync.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim