09 Dec, 2014
11 commits
-
To improve recovery speed, f2fs tries to read ahead many contiguous blocks in
the warm node segment. However, abnormal power-off is infrequent, so when
mounting an f2fs image after a normal power-off, reading ahead that many
blocks and then invalidating them hurts mount performance.
It's better to read ahead only the first next-block in the normal case.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
This patch does cleanup work: it introduces is_valid_blkaddr() to hold the
blkaddr verification code, with its upper and lower boundary values, which was
previously in ra_meta_pages.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
1. We use zero as the upper boundary value when reading ahead SSA/CP blocks, so
verification against the max number fails and readahead is skipped, which
causes low performance.
2. The lower boundary value is not accurate for SSA/CP/POR region verification,
so these values need to be redefined.
This patch fixes the above issues.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
As the inline_{dir,inode} stats are increased/decreased concurrently by multiple
threads, their values are not accurate. Let's use an atomic type to count
accurately.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Added some comments for code readability and cleaned up an if-statement.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
-
If inode state is dirty, go straight to write.
Suggested-by: Jaegeuk Kim
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
-
This patch adds a count of the number of inmemory pages in the page cache.
Signed-off-by: Jaegeuk Kim
-
If the file is closed, let's drop its inmemory pages.
Signed-off-by: Jaegeuk Kim
-
The inmemory pages should be handled by invalidate_page since they need to be
released in the truncation path.
Signed-off-by: Jaegeuk Kim
-
In do_read_inode, if __recover_inline_status fails, the inode keeps its inline
flag without its count having been increased.
Later, f2fs_evict_inode will decrease the count, which results in -1.
Signed-off-by: Jaegeuk Kim
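The imbalance described above can be illustrated with a small userspace model. Every name here (struct inode_model, inline_count, read_inode_buggy, read_inode_balanced, evict_inode) is a hypothetical stand-in and not f2fs code, and the balanced variant is only one possible way to keep the counter consistent, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one global stat and a per-inode inline flag. */
struct inode_model {
	bool has_inline;	/* stands in for the inline flag */
};

int inline_count;		/* stands in for the inline stat counter */

/* Buggy order: the flag is set, recovery fails, the count is never raised. */
int read_inode_buggy(struct inode_model *inode, bool recover_fails)
{
	assert(inode);
	inode->has_inline = true;
	if (recover_fails)
		return -1;	/* early exit skips the increment */
	inline_count++;
	return 0;
}

/* Balanced order: count as soon as the flag is set. */
int read_inode_balanced(struct inode_model *inode, bool recover_fails)
{
	assert(inode);
	inode->has_inline = true;
	inline_count++;
	if (recover_fails)
		return -1;
	return 0;
}

/* Eviction decrements whenever the flag is set. */
void evict_inode(struct inode_model *inode)
{
	assert(inode);
	if (inode->has_inline) {
		inline_count--;
		inode->has_inline = false;
	}
}
```

In this model, a failed read followed by an eviction leaves inline_count at -1 with the buggy order and at 0 with the balanced order.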
-
This patch revisits the retry paths in f2fs.
The basic idea is to use cond_resched instead of retrying from the very early
stage.
Suggested-by: Gu Zheng
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
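As a rough userspace analogy of the idea above, the sketch below retries only the failing step and yields the CPU between attempts, instead of jumping back to the start of the operation. The names try_alloc, alloc_with_local_retry, failures_left and early_stage_runs are hypothetical, and sched_yield() is used as a stand-in for the kernel's cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

int failures_left;	/* how many times the hypothetical allocator still fails */
int early_stage_runs;	/* how often the expensive early stage was executed */

/* Hypothetical allocator that fails a fixed number of times, then succeeds. */
void *try_alloc(size_t size)
{
	if (failures_left > 0) {
		failures_left--;
		return NULL;
	}
	return malloc(size);
}

/* Retry only the failing step, yielding the CPU between attempts. */
void *alloc_with_local_retry(size_t size)
{
	void *p;

	assert(size > 0);
	early_stage_runs++;		/* the early stage runs exactly once */
	while ((p = try_alloc(size)) == NULL)
		sched_yield();		/* userspace stand-in for cond_resched() */
	return p;
}
```

However many times the allocation fails, the early stage is not repeated; only the failing step is.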
06 Dec, 2014
1 commit
-
This patch tries to fix:
BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
(radix_tree_node_alloc+0x14/0x74) from [] (radix_tree_insert+0x110/0x200)
(radix_tree_insert+0x110/0x200) from [] (gc_data_segment+0x340/0x52c)
(gc_data_segment+0x340/0x52c) from [] (f2fs_gc+0x208/0x400)
(f2fs_gc+0x208/0x400) from [] (gc_thread_func+0x248/0x28c)
(gc_thread_func+0x248/0x28c) from [] (kthread+0xa0/0xac)
(kthread+0xa0/0xac) from [] (ret_from_fork+0x14/0x3c)
The reason is that f2fs calls radix_tree_insert with preemption enabled.
So, before calling it, we need to call radix_tree_preload.
Otherwise, we should use _GFP_WAIT for the radix tree, and use mutex or
semaphore to cover the radix tree operations.
Signed-off-by: Jaegeuk Kim
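The preload pattern mentioned above can be modelled in userspace: memory is allocated while sleeping is still allowed, and the insert inside the atomic section only consumes what was set aside. This is a simplified sketch, not the kernel radix tree; preload, preload_end, insert and the in_atomic flag are hypothetical names, and a linked list stands in for the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define PRELOAD_NODES 4

struct node {
	int key;
	struct node *next;
};

struct node *preloaded[PRELOAD_NODES];	/* spare nodes set aside in advance */
int nr_preloaded;
bool in_atomic;				/* models "preemption disabled" */

/* May sleep, so it is only legal outside the atomic section.
 * On success it returns 0 and enters the atomic section. */
int preload(void)
{
	assert(!in_atomic);
	while (nr_preloaded < PRELOAD_NODES) {
		struct node *n = malloc(sizeof(*n));

		if (!n)
			return -1;	/* failure: atomic section not entered */
		preloaded[nr_preloaded++] = n;
	}
	in_atomic = true;
	return 0;
}

/* Leaves the atomic section entered by a successful preload(). */
void preload_end(void)
{
	in_atomic = false;
}

/* Inserts at the list head using only preloaded memory. */
int insert(struct node **head, int key)
{
	struct node *n;

	assert(in_atomic);
	if (nr_preloaded == 0)
		return -1;
	n = preloaded[--nr_preloaded];
	n->key = key;
	n->next = *head;
	*head = n;
	return 0;
}
```

The intended call order in this model is preload(), then insert(), then preload_end(), so that no allocation happens while in_atomic is set.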
04 Dec, 2014
2 commits
-
Previously, we used rwlock for nat_entry lock.
But, now we have a lot of complex operations in set_node_addr.
(e.g., allocating kernel memories, handling radix_trees, and so on)
So, this patch tries to change the spinning rwlock to rw_semaphore to give CPUs
to other threads.
Signed-off-by: Jaegeuk Kim
-
This patch fixes missing kmem_cache_free when handling errors.
Signed-off-by: Jaegeuk Kim
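The class of bug fixed above, an object that is allocated and then not freed on an error path, can be sketched in userspace as follows. malloc/free stand in for the kmem_cache calls, and entry_alloc, entry_free, register_entry, add_entry and live_objects are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

int live_objects;	/* allocations not yet freed, used to spot leaks */

struct entry {
	int id;
};

struct entry *entry_alloc(void)
{
	struct entry *e = malloc(sizeof(*e));

	if (e)
		live_objects++;
	return e;
}

void entry_free(struct entry *e)
{
	if (e) {
		free(e);
		live_objects--;
	}
}

/* Hypothetical step that can fail after the entry was allocated. */
int register_entry(struct entry *e, int should_fail)
{
	(void)e;
	return should_fail ? -1 : 0;
}

int add_entry(int id, int should_fail, struct entry **out)
{
	struct entry *e;
	int err;

	assert(out);
	e = entry_alloc();
	if (!e)
		return -1;
	e->id = id;
	err = register_entry(e, should_fail);
	if (err)
		goto free_entry;
	*out = e;
	return 0;
free_entry:
	entry_free(e);	/* the release that is easy to forget on error */
	return err;
}
```

Without the entry_free() call under the free_entry label, every failed add_entry() would leave live_objects one higher than before.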
03 Dec, 2014
1 commit
-
If there are many inodes that have data blocks in the victim segment,
it takes a long time to find an inode in the gc_inode list.
Let's use radix_tree to reduce lookup time.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
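To show why a radix tree helps here, the following is a minimal fixed-depth radix tree keyed by a 32-bit number, written for userspace. It is a simplified sketch and not the kernel's radix_tree implementation; radix_insert, radix_lookup, struct rnode and the 4-bits-per-level layout are assumptions made for this example. A lookup always takes the same number of steps, regardless of how many entries are stored, whereas a list walk grows with the number of inodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RADIX_BITS	4
#define RADIX_SLOTS	(1u << RADIX_BITS)
#define RADIX_LEVELS	(32 / RADIX_BITS)

struct rnode {
	void *slots[RADIX_SLOTS];	/* child nodes, or items at the last level */
};

/* Slot index for a key at a given level; level 0 uses the top bits. */
static unsigned int slot_of(uint32_t key, int level)
{
	int shift = (RADIX_LEVELS - 1 - level) * RADIX_BITS;

	return (key >> shift) & (RADIX_SLOTS - 1);
}

/* Stores item under key, creating interior nodes as needed. */
int radix_insert(struct rnode *root, uint32_t key, void *item)
{
	struct rnode *n = root;

	assert(root && item);
	for (int level = 0; level < RADIX_LEVELS - 1; level++) {
		unsigned int s = slot_of(key, level);

		if (!n->slots[s]) {
			n->slots[s] = calloc(1, sizeof(struct rnode));
			if (!n->slots[s])
				return -1;
		}
		n = n->slots[s];
	}
	n->slots[slot_of(key, RADIX_LEVELS - 1)] = item;
	return 0;
}

/* Returns the item stored under key, or NULL if there is none. */
void *radix_lookup(const struct rnode *root, uint32_t key)
{
	const struct rnode *n = root;

	for (int level = 0; level < RADIX_LEVELS - 1 && n; level++)
		n = n->slots[slot_of(key, level)];
	return n ? n->slots[slot_of(key, RADIX_LEVELS - 1)] : NULL;
}
```

With an inode number as the key, finding an entry costs RADIX_LEVELS slot dereferences in this sketch, independent of how many inodes are tracked.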
02 Dec, 2014
2 commits
-
We've already made fi and sbi for inode. Let's avoid duplicated work.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
-
Fix the wrong error number in error path of f2fs_write_begin.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
28 Nov, 2014
1 commit
-
Little cleanup to distinguish each phase easily
Signed-off-by: Changman Lee
[Jaegeuk Kim: modify indentation for code readability]
Signed-off-by: Jaegeuk Kim
26 Nov, 2014
6 commits
-
If an inode has converted inline_data which was written to the disk, we should
set its inode flag for further fsync so that this inline_data can be recovered
after a sudden power-off.
Signed-off-by: Jaegeuk Kim
-
If a page is set to be written to the disk, we can mark the page clean.
Signed-off-by: Jaegeuk Kim
-
After flushing dirty nat entries, there must be no more dirty nat
entries.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
-
It's meaningless to check dirty_nat_cnt after re-dirtying nat entries in the
journal. And although there is room for dirty nat entries if dirty_nat_cnt
is zero, it's also meaningless to check __has_cursum_space.
Signed-off-by: Changman Lee
Signed-off-by: Jaegeuk Kim
-
A deadlock can occur:
 Thread 1]                        Thread 2]
 - f2fs_write_data_pages          - f2fs_write_begin
   - lock_page(page #0)
                                    - grab_cache_page(page #X)
                                    - get_node_page(inode_page)
                                    - grab_cache_page(page #0)
                                      : to convert inline_data
   - f2fs_write_data_page
     - f2fs_write_inline_data
       - get_node_page(inode_page)
In this case, trying to lock inode_page and page #0 causes deadlock.
In order to avoid this, this patch adds a rule for this locking policy:
page #0 should be locked first, followed by the inode_page lock.
Signed-off-by: Jaegeuk Kim
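The ordering rule above can be expressed as a tiny rank check, in the spirit of a lock-order validator. This is only an illustrative model with hypothetical names (enum lock_rank, struct lock_ctx, lock_ordered); it does not take real locks and is not how f2fs enforces the rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ranks: a lock with a lower rank must be taken first. */
enum lock_rank {
	RANK_NONE = 0,
	RANK_PAGE0 = 1,		/* data page #0 */
	RANK_INODE_PAGE = 2	/* inode (node) page */
};

struct lock_ctx {
	enum lock_rank highest_held;	/* highest rank this thread holds */
};

/* Returns true if taking `rank` now respects the ordering rule. */
bool lock_ordered(struct lock_ctx *ctx, enum lock_rank rank)
{
	assert(ctx && rank != RANK_NONE);
	if (rank <= ctx->highest_held)
		return false;	/* order inverted: a deadlock is possible */
	ctx->highest_held = rank;
	return true;
}
```

In this model the sequence page #0 then inode_page passes, while inode_page then page #0 (the order Thread 2 used in the trace) is flagged.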
-
Two jump labels were adjusted in the implementation of the
create_node_manager_caches() function because these identifiers
contained typos.
Signed-off-by: Markus Elfring
Acked-by: Randy Dunlap
Signed-off-by: Jaegeuk Kim
24 Nov, 2014
4 commits
-
In f2fs_evict_inode,
  commit_inmemory_pages
    f2fs_gc
      f2fs_iget
        iget_locked
          -> wait for inode free
Here, if the inode is the same as the one being evicted, f2fs will wait forever.
Actually, we should not call f2fs_balance_fs during f2fs_evict_inode to avoid
this.
But commit_inmem_pages calls f2fs_balance_fs by default, even if
f2fs_evict_inode wants to free inmemory pages only.
Hence, this patch triggers f2fs_balance_fs only when there is something
to write.
Signed-off-by: Jaegeuk Kim
-
This patch introduces f2fs_dentry_kunmap to clean up messy code.
Signed-off-by: Jaegeuk Kim
-
nat_entry_set was used when creating the slab for sit_entry_set.
Signed-off-by: Changman Lee
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Whenever f2fs updates mapped pages, it needs to call flush_dcache_page.
Signed-off-by: Jaegeuk Kim
20 Nov, 2014
5 commits
-
Under memory pressure, we don't need to skip SSA page writes.
Signed-off-by: Jaegeuk Kim
-
If a node page is requested to be written during the reclaiming path, we should
submit the bio so that reclaiming the page is not left pending.
Signed-off-by: Jaegeuk Kim
-
Now in f2fs, we have three inode caches: ORPHAN_INO, APPEND_INO, UPDATE_INO,
and we manage the fields related to each inode cache separately in struct
f2fs_sb_info for each inode cache type.
This makes the code a bit messy, so this patch introduces a new struct
inode_management to wrap the inner fields as follows, which makes the code
neater.
/* for inner inode cache management */
struct inode_management {
        struct radix_tree_root ino_root;        /* ino entry array */
        spinlock_t ino_lock;                    /* for ino entry lock */
        struct list_head ino_list;              /* inode list head */
        unsigned long ino_num;                  /* number of entries */
};
struct f2fs_sb_info {
        ...
        struct inode_management im[MAX_INO_ENTRY];      /* manage inode cache */
        ...
}
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
Because we have already checked the contrary condition in the "if" branch, we do
not need to check the condition again in the "else" branch. Let's remove
it.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
-
In f2fs_remount, we stop the gc thread and set need_restart_gc to true when the
new option is set without BG_GC; then, if any error occurs in the following
procedure, we can restart the gc thread to restore the previous state.
But after that, we will fail to restart the gc thread in start_gc_thread because
BG_GC is not set in the new option, so we'd better move this condition check out
of start_gc_thread to fix this issue.
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
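The remount problem above can be modelled in a few lines of userspace C. Everything here is a hypothetical simplification (struct sb_model, start_gc_checked, start_gc, stop_gc, remount_fails), and the sketch assumes that the old mount options are restored only after the restart attempt, which is what makes the in-helper option check fail.

```c
#include <assert.h>
#include <stdbool.h>

struct sb_model {
	bool opt_bg_gc;		/* stands in for the BG_GC mount option */
	bool gc_running;	/* whether the gc thread is running */
};

/* Old shape: the helper checks the option and may silently do nothing. */
void start_gc_checked(struct sb_model *sb)
{
	if (!sb->opt_bg_gc)
		return;
	sb->gc_running = true;
}

/* New shape: unconditional; callers decide whether gc should run. */
void start_gc(struct sb_model *sb)
{
	sb->gc_running = true;
}

void stop_gc(struct sb_model *sb)
{
	sb->gc_running = false;
}

/* Models a remount that drops BG_GC and then fails in a later step.
 * Returns whether the gc thread is running after the rollback. */
bool remount_fails(struct sb_model *sb, bool checked_helper)
{
	bool old_opt;
	bool need_restart_gc = false;

	assert(sb);
	old_opt = sb->opt_bg_gc;
	sb->opt_bg_gc = false;		/* the new option has no BG_GC */
	if (sb->gc_running) {
		stop_gc(sb);
		need_restart_gc = true;
	}
	/* ... a later step fails here, so we roll back ... */
	if (need_restart_gc) {
		if (checked_helper)
			start_gc_checked(sb);
		else
			start_gc(sb);
	}
	sb->opt_bg_gc = old_opt;	/* assumed: options restored last */
	return sb->gc_running;
}
```

With the checked helper the gc thread stays stopped after the failed remount; with the unconditional helper it is running again.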
19 Nov, 2014
2 commits
-
We should put the inode page when an error occurs.
Signed-off-by: Jaegeuk Kim
-
The locked page should be released before returning from the function.
Reviewed-by: Chao Yu
Signed-off-by: Jaegeuk Kim
12 Nov, 2014
2 commits
-
If i_size grows beyond MAX_INLINE_DATA, we should convert the inode.
Otherwise, we can make some dirty pages during the truncation, and those pages
will be written through f2fs_write_data_page.
At that moment, the inode still has inline_data, so it tries to write
non-zero pages into the inline_data area.
Signed-off-by: Jaegeuk Kim
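A minimal model of the rule above: when the new size no longer fits inline, the inode leaves inline mode before the size changes. The names struct inode_model, convert_inline and set_size are hypothetical, and MODEL_MAX_INLINE_DATA is an arbitrary illustrative limit, not the real MAX_INLINE_DATA value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_INLINE_DATA 3000u	/* arbitrary limit for illustration */

struct inode_model {
	bool has_inline_data;
	size_t i_size;
};

/* Stands in for moving inline data out to a regular data block. */
void convert_inline(struct inode_model *inode)
{
	inode->has_inline_data = false;
}

/* Changes the size; leaves inline mode first if the new size cannot fit. */
void set_size(struct inode_model *inode, size_t new_size)
{
	assert(inode != NULL);
	if (inode->has_inline_data && new_size > MODEL_MAX_INLINE_DATA)
		convert_inline(inode);
	inode->i_size = new_size;
}
```

After set_size() with a size above the limit, later page writes in this model no longer see an inline inode, so they cannot target the inline area.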
-
The scenario is like this.
One thread triggers:
f2fs_write_data_pages
lock_page
f2fs_write_data_page
f2fs_lock_op
11 Nov, 2014
1 commit
-
The # of inline_data inodes is decreased only when the inode has inline_data.
After clearing the flag, we can't decrease the number.
Signed-off-by: Jaegeuk Kim
10 Nov, 2014
2 commits
-
If a mount option has dirsync, we should call checkpoint for all the directory
operations.
Signed-off-by: Jaegeuk Kim
-
Under memory pressure, let's avoid skipping data writes.
Signed-off-by: Jaegeuk Kim