12 Jan, 2016

1 commit


31 Dec, 2015

1 commit

  • Sometimes we stay silent when an IO error occurs in a lower-layer
    device, so the user receives no error return value for some
    operations even though the operation did not actually succeed.

    This should be avoided, so this patch reports this kind of error to
    the user.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
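    The idea can be modeled in userspace with a minimal sketch (all
    names here are hypothetical stand-ins, not the real f2fs code):
    record that a lower-layer bio failed, and return -EIO instead of
    silently claiming success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: propagate a recorded lower-layer IO error. */
struct f2fs_sb_model {
    bool io_error;            /* set when a lower-layer bio fails */
};

static int do_operation(struct f2fs_sb_model *sbi)
{
    /* ... the operation itself ... */
    if (sbi->io_error)
        return -EIO;          /* report the failure to the user */
    return 0;                 /* only claim success when it is true */
}
```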
     

05 Dec, 2015

1 commit


14 Oct, 2015

2 commits

  • Once f2fs_gc is done, wait_ms is changed once more, so its
    tracepoint should be located after that change.

    Reported-by: He YunLei
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • different competitors

    Since we use different page caches (normally the inode's page cache
    for R/W, and the meta inode's page cache for GC) to cache the same
    physical block belonging to an encrypted inode, writeback from
    these two page caches should be exclusive. But currently we do not
    handle the writeback state well, so there are potential races:

    a)
    kworker:
    - f2fs_write_data_pages
      - f2fs_write_data_page
        - do_write_data_page
          - write_data_page
            - f2fs_submit_page_mbio
              (page#1 in the inode's page cache is queued in the f2fs
              bio cache, ready to be written to the new blkaddr)
    f2fs_gc:
    - gc_data_segment
      - move_encrypted_block
        - pagecache_get_page
          (page#2 in the meta inode's page cache is cached with the
          invalid data of the physical block located at the new
          blkaddr)
        - f2fs_submit_page_mbio
          (page#1 is submitted; later, page#2 with invalid data will be
          submitted)

    b)
    f2fs_gc:
    - gc_data_segment
      - move_encrypted_block
        - f2fs_submit_page_mbio
          (page#1 in the meta inode's page cache is queued in the f2fs
          bio cache, ready to be written to the new blkaddr)
    user thread:
    - f2fs_write_begin
      - f2fs_submit_page_bio
        (we submit a request to the block layer to update page#2 in the
        inode's page cache from the physical block located at the new
        blkaddr, so here we may read garbage data from the new blkaddr
        since GC has not written back page#1 yet)

    This patch fixes the above potential races for encrypted inodes.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
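    The serialization the fix needs can be modeled in userspace (flow
    and names below are illustrative, not the real kernel code): before
    GC reads the block at the new blkaddr through the meta inode's page
    cache, it must wait for any in-flight writeback of the data inode's
    page, or it caches stale contents.

```c
#include <assert.h>
#include <stdbool.h>

struct disk_blk  { char on_disk; };
struct data_page { bool under_writeback; char data; struct disk_blk *blk; };

static void writeback_complete(struct data_page *p)
{
    p->blk->on_disk = p->data;     /* kworker finishes writing page#1 */
    p->under_writeback = false;
}

static char gc_read_new_blkaddr(struct data_page *p)
{
    if (p->under_writeback)
        writeback_complete(p);     /* f2fs_wait_on_page_writeback()-style wait */
    return p->blk->on_disk;        /* page#2 now caches valid data */
}
```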
     

13 Oct, 2015

2 commits

  • Now, we use ra_meta_pages to read as many contiguous physical
    blocks as possible to improve the performance of subsequent reads.
    However, ra_meta_pages takes a synchronous approach, submitting the
    bio with READ; since READ has high priority, it cannot be used for
    preloading blocks when it is not certain when the read-ahead pages
    will actually be used.

    This patch supports asynchronous readahead in ra_meta_pages by
    tagging the bio with the READA flag in order to allow preloading.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
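    The flag choice can be shown with a tiny model (the enum constants
    below are stand-ins, not the kernel's actual request flags): sync
    readahead keeps the high-priority READ, async preload tags READA so
    the block layer may drop the request under load.

```c
#include <assert.h>
#include <stdbool.h>

enum rw_hint { RW_READ, RW_READA };

static enum rw_hint ra_meta_pages_hint(bool sync)
{
    return sync ? RW_READ : RW_READA;   /* preloading uses READA */
}
```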
     
  • For normal inodes, pages are allocated with __GFP_FS, which can
    cause filesystem calls when reclaiming memory.
    This can accordingly lead to a deadlock.

    So, this patch addresses this problem by introducing
    f2fs_grab_cache_page(.., bool for_write), which calls
    grab_cache_page_write_begin() with AOP_FLAG_NOFS.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
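    A userspace model of the wrapper described above (kernel helpers
    are stubbed out, but the shape follows the commit message): writers
    grab the page via grab_cache_page_write_begin() with AOP_FLAG_NOFS
    so memory reclaim cannot re-enter the filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AOP_FLAG_NOFS 0x1

struct page_model { unsigned aop_flags; };
static struct page_model the_page;

static struct page_model *grab_cache_page_stub(unsigned aop_flags)
{
    the_page.aop_flags = aop_flags;   /* NOFS drops __GFP_FS on alloc */
    return &the_page;
}

static struct page_model *f2fs_grab_cache_page(bool for_write)
{
    if (for_write)
        return grab_cache_page_stub(AOP_FLAG_NOFS);  /* deadlock-safe */
    return grab_cache_page_stub(0);                  /* readers keep __GFP_FS */
}
```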
     

10 Oct, 2015

7 commits

  • This patch introduces a tracepoint to monitor background gc behaviors.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch introduces background_gc=sync, enabling synchronous
    cleaning in the background.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch drops batched gc triggered through the ioctl, since the
    user can easily control gc by designing a loop around the ->ioctl.

    We support synchronous gc by forcing FG_GC in f2fs_gc, so the user
    can make sure that all blocks gc-ed in this round are persistent on
    the device by the time the ioctl returns.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
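    The user-side batching can be modeled like this (all names are
    model stand-ins, not the real ioctl interface): the application
    calls a single synchronous gc round in its own loop, so each
    round's blocks are persistent before the next call.

```c
#include <assert.h>

static int blocks_dirty = 5;

static int gc_one_round(void)          /* stand-in for one sync gc ioctl */
{
    if (blocks_dirty == 0)
        return -1;                     /* nothing left to clean */
    blocks_dirty--;                    /* this round is persisted on return */
    return 0;
}

static int gc_loop(int want)
{
    int done = 0;
    while (done < want && gc_one_round() == 0)
        done++;                        /* user-controlled batching */
    return done;
}
```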
     
  • When searching for a victim during gc, if there are no dirty
    segments in the filesystem, we still take the time to search the
    whole dirty segment map. That is not needed; it is better to skip
    the search in this case.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • When doing gc, we search for a victim in the dirty map, starting
    from the position of the last victim. When we touch the end of the
    dirty map, we reset the current search position and then search the
    whole dirty map, so sometimes we search the range [victim, last]
    twice. That is redundant; this patch avoids the issue.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
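    A sketch of a circular scan that visits each position exactly once,
    under the assumption that the map is a simple dirty flag per
    segment: start from the last victim, wrap modulo the map size, and
    stop after one full pass instead of rescanning [victim, last].

```c
#include <assert.h>

static int find_victim(const int *dirty, int nsegs, int last_victim)
{
    for (int i = 0; i < nsegs; i++) {
        int seg = (last_victim + i) % nsegs;   /* wrap, don't restart */
        if (dirty[seg])
            return seg;
    }
    return -1;                                 /* no dirty segment */
}
```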
     
  • If we do not call get_victim first, we cannot get a new victim for retrial
    path.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch fixes gc to maintain the right count of sections freed
    when a foreground gc is triggered.

    Besides, when a foreground gc is running on the currently selected
    section, once we fail to gc one segment it is better to abandon
    gc-ing the remaining segments in that section: since we will select
    the next victim for foreground gc anyway, gc on the remaining
    segments of the previous section only adds overhead and long
    latency for the caller.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
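    The early-abandon policy can be modeled like this (names and the
    per-segment failure stand-in are illustrative): gc the segments of
    the selected section in order, stop at the first failure, and
    return how many were actually freed so the caller's count stays
    accurate.

```c
#include <assert.h>

static int do_gc_segment(int seg, int failing_seg)   /* stand-in */
{
    return seg == failing_seg ? -1 : 0;
}

static int gc_section(int start, int segs_per_sec, int failing_seg)
{
    int freed = 0;
    for (int i = 0; i < segs_per_sec; i++) {
        if (do_gc_segment(start + i, failing_seg) < 0)
            break;              /* abandon the rest; pick a new victim */
        freed++;
    }
    return freed;               /* accurate freed count for the caller */
}
```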
     

21 Aug, 2015

3 commits


05 Aug, 2015

2 commits


25 Jul, 2015

2 commits

  • The cgroup code attaches inode->i_wb via mark_inode_dirty, and when
    set_page_writeback is called, __inc_wb_stat() updates i_wb's stats.

    So, we need to explicitly call set_page_dirty->__mark_inode_dirty
    prior to writing back any pages.

    This patch should resolve the following kernel panic reported by Andreas Reis.

    https://bugzilla.kernel.org/show_bug.cgi?id=101801

    --- Comment #2 from Andreas Reis ---
    BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
    IP: [] __percpu_counter_add+0x1a/0x90
    PGD 2951ff067 PUD 2df43f067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1
    Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS
    T01 02/03/2015
    task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
    RIP: 0010:[] []
    __percpu_counter_add+0x1a/0x90
    RSP: 0018:ffff880295143ac8 EFLAGS: 00010082
    RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
    RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
    RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
    R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
    R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
    FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Stack:
    0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
    ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
    0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
    Call Trace:
    [] __test_set_page_writeback+0xde/0x1d0
    [] do_write_data_page+0xe7/0x3a0
    [] gc_data_segment+0x5aa/0x640
    [] do_garbage_collect+0x138/0x150
    [] f2fs_gc+0x1be/0x3e0
    [] f2fs_balance_fs+0x81/0x90
    [] f2fs_unlink+0x47/0x1d0
    [] vfs_unlink+0x109/0x1b0
    [] do_unlinkat+0x287/0x2c0
    [] SyS_unlink+0x16/0x20
    [] entry_SYSCALL_64_fastpath+0x12/0x71
    Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
    89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e 8b 47 20 48 63 ca
    65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
    RIP [] __percpu_counter_add+0x1a/0x90
    RSP
    CR2: 00000000000000a8
    ---[ end trace 5132449a58ed93a3 ]---
    note: gcc[10356] exited with preempt_count 2

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
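    The ordering bug can be modeled in userspace (structures below are
    simplified stand-ins for the kernel's): set_page_writeback() ends
    up in __inc_wb_stat(), which dereferences inode->i_wb, but only the
    dirtying path attaches i_wb, so writeback before dirty hits a NULL
    pointer as in the oops above.

```c
#include <assert.h>
#include <stddef.h>

struct wb_stat     { int nr_writeback; };
struct inode_model { struct wb_stat *i_wb; };

static struct wb_stat bdi_wb;

static void set_page_dirty_model(struct inode_model *inode)
{
    inode->i_wb = &bdi_wb;             /* __mark_inode_dirty attaches i_wb */
}

static int set_page_writeback_model(struct inode_model *inode)
{
    if (inode->i_wb == NULL)
        return -1;                     /* the NULL-deref path */
    inode->i_wb->nr_writeback++;       /* __inc_wb_stat() */
    return 0;
}
```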
     
  • This patch fixes some missing error handlers.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

02 Jun, 2015

1 commit


29 May, 2015

4 commits


11 Apr, 2015

1 commit


12 Feb, 2015

3 commits


10 Jan, 2015

1 commit

  • There are two slab caches, inode_entry_slab and winode_slab, using
    the same structure, as below:

    struct dir_inode_entry {
            struct list_head list;   /* list head */
            struct inode *inode;     /* vfs inode pointer */
    };

    struct inode_entry {
            struct list_head list;
            struct inode *inode;
    };

    It is a bit wasteful that the two caches cannot share their memory
    space with each other.
    So this patch removes the redundant winode_slab slab cache, keeps
    the more universal name struct inode_entry for the remaining data
    structure, and finally reuses inode_entry_slab to store both
    dirty-dir items and gc items more efficiently.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

09 Dec, 2014

1 commit


06 Dec, 2014

1 commit

  • This patch tries to fix:

    BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384
    (radix_tree_node_alloc+0x14/0x74) from [] (radix_tree_insert+0x110/0x200)
    (radix_tree_insert+0x110/0x200) from [] (gc_data_segment+0x340/0x52c)
    (gc_data_segment+0x340/0x52c) from [] (f2fs_gc+0x208/0x400)
    (f2fs_gc+0x208/0x400) from [] (gc_thread_func+0x248/0x28c)
    (gc_thread_func+0x248/0x28c) from [] (kthread+0xa0/0xac)
    (kthread+0xa0/0xac) from [] (ret_from_fork+0x14/0x3c)

    The reason is that f2fs calls radix_tree_insert with preemption
    enabled. So, before calling it, we need to call radix_tree_preload.

    Otherwise, we would have to use __GFP_WAIT for the radix tree, and
    cover the radix tree operations with a mutex or semaphore.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
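    The preload pattern can be modeled in userspace (flags and helpers
    below are stand-ins for the kernel API): the tree node is allocated
    while sleeping is still allowed, so the insertion itself never
    allocates inside the atomic region that triggered the
    smp_processor_id() warning above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool in_atomic;          /* models "preemption disabled" */
static void *preloaded_node;

static int alloc_node(void)     /* may sleep: illegal while atomic */
{
    static int node;
    if (in_atomic)
        return -1;              /* the buggy path in the trace above */
    preloaded_node = &node;
    return 0;
}

static int radix_insert_model(void)
{
    int ret = 0;
    in_atomic = true;                   /* insertion runs atomically */
    if (preloaded_node == NULL)
        ret = alloc_node();             /* no preload: allocates atomically */
    in_atomic = false;
    return ret;
}
```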
     

03 Dec, 2014

1 commit


28 Nov, 2014

1 commit


20 Nov, 2014

1 commit

  • In f2fs_remount, we stop the gc thread and set need_restart_gc to
    true when a new option is set without BG_GC; then, if any error
    occurs in the following procedure, we can restore things by
    restarting the gc thread.
    But in that case we fail to restore the gc thread in
    start_gc_thread, because BG_GC is not set in the new option. So we
    had better move this condition check out of start_gc_thread to fix
    the issue.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

05 Nov, 2014

1 commit


04 Nov, 2014

1 commit


01 Oct, 2014

2 commits

  • This patch cleans up the existing and new macros for readability.

    Rule is like this.

                  ,-----------------------------------------> MAX_BLKADDR -,
                  |  ,------------- TOTAL_BLKS ----------------------------,
                  |  |                                                     |
                  |  ,- seg0_blkaddr   ,----- sit/nat/ssa/main blkaddress  |
    block         |  | (SEG0_BLKADDR)  |  |  |  | (e.g., MAIN_BLKADDR)     |
    address       0..x................ a  b  c  d .........................
                     |                          |
    global seg#      0...................... m ............................
                     |                       |  |
                     |                       `------- MAIN_SEGS ----------'
                     `-------------- TOTAL_SEGS --------------------------'
                                             |  |
    seg#                                     0..........xx.................

    = Note =
    o GET_SEGNO_FROM_SEG0 : blk address -> global segno
    o GET_SEGNO : blk address -> segno
    o START_BLOCK : segno -> starting block address

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
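    The arithmetic behind two of the macros in the note can be sketched
    as follows (the geometry values are made up for the example):
    GET_SEGNO_FROM_SEG0 maps a block address to a global segment
    number, and START_BLOCK inverts the mapping.

```c
#include <assert.h>

enum { SEG0_BLKADDR = 512, BLKS_PER_SEG = 512 };

static unsigned get_segno_from_seg0(unsigned blkaddr)
{
    return (blkaddr - SEG0_BLKADDR) / BLKS_PER_SEG;   /* blk -> global seg# */
}

static unsigned start_block(unsigned segno)
{
    return SEG0_BLKADDR + segno * BLKS_PER_SEG;       /* seg# -> first blk */
}
```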
     
  • This patch adds a new data structure to control checkpoint
    parameters. Currently, it carries the reason for the checkpoint,
    such as is_umount or a normal sync.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim