16 Jul, 2016

1 commit


09 Jul, 2016

1 commit


03 Jun, 2016

3 commits


08 May, 2016

1 commit


27 Feb, 2016

1 commit


23 Feb, 2016

6 commits

  • There are redundant pointer conversions in the following call stack:
    - at position a, inode is converted to f2fs_inode_info.
    - at position b, f2fs_inode_info is converted back to inode.

    - truncate_blocks(inode,..)
    - fi = F2FS_I(inode) ---a
    - ADDRS_PER_PAGE(node_page, fi)
    - addrs_per_inode(fi)
    - inode = &fi->vfs_inode ---b
    - f2fs_has_inline_xattr(inode)
    - fi = F2FS_I(inode)
    - is_inode_flag_set(fi,..)

    In order to avoid the unneeded conversions, alter ADDRS_PER_PAGE and
    addrs_per_inode to accept a parameter of type inode pointer.
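
The round trip described above can be illustrated with a minimal userspace sketch. The struct layouts, the `to_inode` helper and `check_roundtrip` are hypothetical stand-ins, and `container_of` is re-implemented here rather than taken from kernel headers; only the embedding of the VFS inode inside the f2fs structure is taken from the call stack in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the embedding matters. */
struct inode {
    unsigned long i_ino;
};

struct f2fs_inode_info {
    unsigned long flags;     /* illustrative field */
    struct inode vfs_inode;  /* VFS inode embedded in the f2fs structure */
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* inode -> f2fs_inode_info: position "a" in the call stack. */
static struct f2fs_inode_info *F2FS_I(struct inode *inode)
{
    return container_of(inode, struct f2fs_inode_info, vfs_inode);
}

/* f2fs_inode_info -> inode: position "b", the step the patch makes unnecessary. */
static struct inode *to_inode(struct f2fs_inode_info *fi)
{
    return &fi->vfs_inode;
}

/* Both conversions are pure pointer arithmetic, so a round trip is a no-op. */
static void check_roundtrip(struct f2fs_inode_info *fi)
{
    assert(F2FS_I(to_inode(fi)) == fi);
}
```

Because the round trip always returns the original pointer, passing the inode pointer all the way down loses nothing and removes the back-and-forth.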

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • We need to give the shrinker a chance to be rescheduled while shrinking slab entries.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • In the worst case, we need to scan the whole radix tree and every rb-tree to
    free the victim extent_nodes when shrinking.

    Pengyang initially introduced a victim_list to record the victim extent_nodes,
    so that they could be freed by scanning just a list.

    Later, Chao Yu enhanced the original patch to reduce the memory footprint by
    removing the victim list.

    The policy of lru list shrinking becomes:
    1) lock lru list's lock
    2) trylock extent tree's lock
    3) remove extent node from lru list
    4) unlock lru list's lock
    5) do shrink
    6) repeat 1) to 5)
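
The numbered policy above can be sketched as a single-threaded model. Everything here is an assumption made for illustration: locks are plain flags so that a failed trylock is observable, the LRU list is a small array, and requeueing a node whose tree is busy is one possible way to handle a failed trylock, not something the message specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LRU_MAX 8

/* Model lock: a flag, so a failed trylock is visible without threads. */
struct model_lock { bool held; };

static void lock_acquire(struct model_lock *l) { assert(!l->held); l->held = true; }
static bool lock_try(struct model_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}
static void lock_release(struct model_lock *l) { l->held = false; }

struct extent_tree { struct model_lock lock; unsigned int node_cnt; };
struct extent_node { struct extent_tree *et; };

struct lru_list {
    struct model_lock lock;
    struct extent_node *items[LRU_MAX]; /* items[0] is least recently used */
    unsigned int count;
};

/* Remove and return the least recently used node. */
static struct extent_node *lru_pop_front(struct lru_list *lru)
{
    struct extent_node *en = lru->items[0];

    lru->count--;
    memmove(&lru->items[0], &lru->items[1],
            lru->count * sizeof(lru->items[0]));
    return en;
}

/* Make up to nr_attempts shrink attempts; returns the number of nodes freed. */
static unsigned int shrink_lru(struct lru_list *lru, unsigned int nr_attempts)
{
    unsigned int freed = 0;

    for (; nr_attempts > 0; nr_attempts--) {
        struct extent_node *en;

        lock_acquire(&lru->lock);                /* 1) lock lru list's lock */
        if (lru->count == 0) {
            lock_release(&lru->lock);
            break;
        }
        en = lru->items[0];
        if (!lock_try(&en->et->lock)) {          /* 2) trylock extent tree's lock */
            en = lru_pop_front(lru);             /* busy tree: requeue at the tail */
            lru->items[lru->count++] = en;
            lock_release(&lru->lock);
            continue;
        }
        en = lru_pop_front(lru);                 /* 3) remove node from lru list */
        lock_release(&lru->lock);                /* 4) unlock lru list's lock */

        en->et->node_cnt--;                      /* 5) do shrink */
        lock_release(&en->et->lock);
        freed++;
    }                                            /* 6) repeat 1) to 5) */
    return freed;
}
```

The point of the trylock in step 2 is that the shrinker never blocks on a tree lock while holding the list lock.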

    Signed-off-by: Hou Pengyang
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Hou Pengyang
     
  • If en has an empty list pointer, it will be freed soon, so we don't need to
    set cached_en to it.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch moves extent_node list operations to be handled together with
    its rbtree operations.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • There are three steps to free an extent node:
    1) list_del_init, 2) __detach_extent_node, 3) kmem_cache_free

    In the f2fs_destroy_extent_tree path, a node is freed in the order 1->2->3,
    but in the f2fs_update_extent_tree_range path, the order is 2->1->3.

    This patch makes the order 1->2->3 in all paths.
    This makes sense because the next patch introduces a victim list in the
    shrink_extent_tree path, where we can check whether an extent_node is in the
    victim list by calling list_empty(). So it is necessary to put 1) first.
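
Why step 1) has to come first can be seen in a small userspace sketch. The list helpers below are minimal re-implementations that follow the semantics of the kernel helpers of the same name; the `in_tree` flag and `free()` are stand-ins (assumptions) for `__detach_extent_node` and `kmem_cache_free`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink the entry and reinitialise it, so list_empty(entry) becomes true. */
static void list_del_init(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    INIT_LIST_HEAD(e);
}

struct extent_node {
    struct list_head list;
    bool in_tree;            /* stand-in for rb-tree membership */
};

/* Free a node in the unified order 1 -> 2 -> 3. */
static void free_extent_node(struct extent_node *en)
{
    list_del_init(&en->list);          /* 1) unlink from the list */
    assert(list_empty(&en->list));     /* any later list_empty() check sees "not listed" */
    en->in_tree = false;               /* 2) stand-in for __detach_extent_node() */
    free(en);                          /* 3) stand-in for kmem_cache_free() */
}
```

Once `list_del_init` has run, a `list_empty()` test on the node reliably reports that it is on no list, which is the check the message relies on.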

    Signed-off-by: Hou Pengyang
    Signed-off-by: Jaegeuk Kim

    Hou Pengyang
     

09 Jan, 2016

2 commits


01 Jan, 2016

1 commit


31 Dec, 2015

2 commits

  • Otherwise, we can get mismatched largest extent information.

    One example is:
    1. mount f2fs w/ extent_cache
    2. make a small extent
    3. umount
    4. mount f2fs w/o extent_cache
    5. update the largest extent
    6. umount
    7. mount f2fs w/ extent_cache
    8. get the old extent made by #2

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • If there are no candidates for shrinking slab entries, we don't need to traverse
    any trees at all.

    Reviewed-by: Chao Yu
    [Jaegeuk Kim: fix missing initialization reported by Yunlei He]
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

23 Dec, 2015

1 commit


05 Dec, 2015

2 commits

  • For direct IO, f2fs only allocates a new address for a block which did not
    exist on disk before, so its mapping info cannot already exist in the extent
    cache. Hence we do not need to call f2fs_drop_largest_extent here to drop
    the related cache.

    Since f2fs_drop_largest_extent has no more callers now, kill it.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • While handling extent trees, we can enter a reclaiming path at any time.
    If it tries to release some extent nodes in the same extent tree,
    write_lock(&et->lock) would hang.
    In order to avoid the deadlock, we can just skip it.

    Note that, if it is an unreferenced tree, we should get write_lock(&et->lock)
    successfully and release all of the nodes therein.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

23 Oct, 2015

1 commit


10 Oct, 2015

6 commits

  • This patch adds a new helper __try_update_largest_extent for cleanup.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Fix 2 potential problems:
    1. When the largest extent needs to be invalidated, it is reset in
    __drop_largest_extent, which makes the subsequent __is_extent_same always
    return false and leaves the largest extent unchanged. Now we update it properly.

    2. When an extent is split and the latter part remains in the tree, next_en
    should be the latter part instead of the next extent of the original extent.
    This would cause a merge failure if there were an in-place update. Although
    there is none, I think this fix still makes the code less ambiguous.

    This patch also simplifies the code for invalidating extents, and optimizes the
    procedure that splits an extent into two.
    There are a few modifications since the last patch:
    1. prev_en is now updated properly.
    2. more code and branches are simplified.

    Signed-off-by: Fan li
    Signed-off-by: Jaegeuk Kim

    Fan Li
     
  • Now that we update extents by range, fofs may not be in the largest
    extent even if the new extent overlaps with it. So add a new function
    to drop the largest extent properly.
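
A range-based check of this kind might look like the sketch below. The `extent_info` layout and both function names are illustrative assumptions, and marking an extent invalid by zeroing its length is a convention chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative extent: file offset, start block, length in blocks. */
struct extent_info { unsigned int fofs, blk, len; };

/* True if [fofs, fofs + len) intersects the extent's file-offset range. */
static bool extent_overlaps(const struct extent_info *ei,
                            unsigned int fofs, unsigned int len)
{
    return ei->len != 0 &&
           fofs < ei->fofs + ei->len &&
           fofs + len > ei->fofs;
}

/* Invalidate the cached largest extent if the updated range touches it. */
static bool drop_largest_if_overlapping(struct extent_info *largest,
                                        unsigned int fofs, unsigned int len)
{
    assert(largest != NULL);
    if (!extent_overlaps(largest, fofs, len))
        return false;
    largest->len = 0;   /* zero length marks the extent as invalid here */
    return true;
}
```

Note that an update starting before the largest extent (for example at offset 8 with length 3 against an extent starting at 10) still overlaps it, which a check on the starting offset alone would miss.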

    Signed-off-by: Fan li
    Signed-off-by: Jaegeuk Kim

    Fan Li
     
  • This function should be static.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • When shrinking extent cache, we have two steps in the flow:
    1) shrink objects which are unreferenced by inodes;
    2) shrink objects from LRU list of extent cache.

    In step 1, if we haven't shrunk enough objects, we will try step 2, but
    before that we didn't update the search position, which may point to the
    last inode index in the global extent tree, resulting in a failure to
    shrink objects by traversing all inodes' extent trees.

    In this patch, we reset the search position to the beginning of the global
    extent tree to fix this.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Rename trace_f2fs_update_extent_tree to trace_f2fs_update_extent_tree_range,
    then expand and enable it to trace batched extent info updates.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

29 Aug, 2015

1 commit

  • If extent cache is disabled, we will encounter an oops when triggering direct
    IO as below:

    BUG: unable to handle kernel NULL pointer dereference at 0000000c
    IP: [] f2fs_drop_largest_extent+0xe/0x30 [f2fs]
    *pdpt = 000000002bb9a001 *pde = 0000000000000000
    Oops: 0000 [#1] SMP
    Modules linked in: f2fs(O) fuse bnep rfcomm bluetooth nfsd dm_crypt nfs_acl auth_rpcgss oid_registry nfs binfmt_misc fscache lockd
    sunrpc grace snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer
    snd_seq_device snd soundcore joydev psmouse hid_generic i2c_piix4 serio_raw ppdev mac_hid parport_pc lp parport ext4 jbd2 mbcache
    usbhid hid e1000
    CPU: 3 PID: 3608 Comm: dd Tainted: G O 4.2.0-rc4 #12
    Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    task: ef161600 ti: ebd5e000 task.ti: ebd5e000
    EIP: 0060:[] EFLAGS: 00010202 CPU: 3
    EIP is at f2fs_drop_largest_extent+0xe/0x30 [f2fs]
    EAX: 00000000 EBX: ddebc000 ECX: 00000000 EDX: 00000000
    ESI: ebd5fdf8 EDI: 00000000 EBP: ebd5fd58 ESP: ebd5fd58
    DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    CR0: 80050033 CR2: 0000000c CR3: 2c24ee40 CR4: 000006f0
    Stack:
    ebd5fda4 f0b8c005 00000000 00000001 00000000 f0b8c430 c816cd68 ddebc000
    ddebc088 00001000 00000555 00000555 ffffffff c160bb00 00055501 00000000
    00000000 00000100 00000000 ebd5fe20 f0b8c430 00000046 ef161600 00001000
    Call Trace:
    [] __allocate_data_block+0x1a5/0x260 [f2fs]
    [] ? f2fs_direct_IO+0x370/0x440 [f2fs]
    [] ? down_read+0x30/0x50
    [] f2fs_direct_IO+0x370/0x440 [f2fs]
    [] generic_file_direct_write+0xa5/0x260
    [] ? current_fs_time+0x18/0x50
    [] __generic_file_write_iter+0xbb/0x210
    [] ? generic_file_write_iter+0x2f/0x320
    [] generic_file_write_iter+0x15c/0x320
    [] f2fs_file_write_iter+0x39/0x80 [f2fs]
    [] __vfs_write+0xa9/0xe0
    [] vfs_write+0x97/0x180
    [] SyS_write+0x5b/0xd0
    [] sysenter_do_call+0x12/0x12
    Code: 10 8b 50 1c 89 53 14 eb ca 8d 74 26 00 85 f6 74 86 eb a6 0f 0b 90 8d b4 26 00 00 00 00 55 89 e5 3e 8d 74 26 00 8b 80 d4 02 00
    00 48 0c 39 d1 77 0e 03 48 14 39 ca 73 07 c7 40 14 00 00 00 00
    EIP: [] f2fs_drop_largest_extent+0xe/0x30 [f2fs] SS:ESP 0068:ebd5fd58
    CR2: 000000000000000c
    ---[ end trace a38c07026a1afffd ]---

    This is because when extent cache is disabled, the extent_tree pointer in
    struct f2fs_inode_info is NULL, but f2fs_drop_largest_extent accesses this
    NULL pointer directly without checking the state of the extent cache, so
    the oops occurs. Let's fix it by checking the state of the extent cache
    before accessing the pointer.
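
The shape of such a guard can be sketched in userspace as follows. The struct layouts and function name are hypothetical, and the "extent cache disabled" state is modelled simply as a NULL `extent_tree` pointer, as described in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified layouts for illustration only. */
struct extent_info { unsigned int fofs, len; };
struct extent_tree { struct extent_info largest; };
struct inode_info {
    struct extent_tree *extent_tree;  /* NULL when the extent cache is disabled */
};

/* Drop the largest extent if fofs falls inside it; returns true if dropped. */
static bool drop_largest_extent(struct inode_info *fi, unsigned int fofs)
{
    struct extent_info *largest;

    assert(fi != NULL);
    if (fi->extent_tree == NULL)      /* cache disabled: nothing to drop */
        return false;

    largest = &fi->extent_tree->largest;
    if (fofs < largest->fofs || fofs >= largest->fofs + largest->len)
        return false;

    largest->len = 0;                 /* zero length marks it invalid here */
    return true;
}
```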

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

27 Aug, 2015

1 commit

  • This patch introduces a new helper f2fs_update_extent_tree_range which can
    update the extent mapping for a specified range.

    The main idea is:
    1) punch all mapping info in extent node(s) which fall in the specified range;
    2) try to merge the new extent mapping with an adjacent node, or failing that,
    insert the mapping into the extent tree as a new node.
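
Step 1) can be illustrated on a single extent. This is a sketch under assumptions: the `extent_info` layout and `punch_extent` are invented for the example, and it handles one extent at a time rather than walking a tree.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative extent: file offset, start block, length in blocks. */
struct extent_info { unsigned int fofs, blk, len; };

/*
 * Remove the file-offset range [fofs, fofs + len) from *ei.
 * The surviving pieces (head and/or tail) are written to out[0..1];
 * the return value is the number of pieces left (0, 1 or 2).
 */
static int punch_extent(const struct extent_info *ei,
                        unsigned int fofs, unsigned int len,
                        struct extent_info out[2])
{
    unsigned int end = fofs + len;
    unsigned int ei_end = ei->fofs + ei->len;
    int n = 0;

    assert(out != NULL);
    if (ei->len == 0)
        return 0;
    if (end <= ei->fofs || fofs >= ei_end) {   /* no overlap: keep as is */
        out[0] = *ei;
        return 1;
    }
    if (fofs > ei->fofs) {                     /* head survives */
        out[n].fofs = ei->fofs;
        out[n].blk = ei->blk;
        out[n].len = fofs - ei->fofs;
        n++;
    }
    if (end < ei_end) {                        /* tail survives */
        out[n].fofs = end;
        out[n].blk = ei->blk + (end - ei->fofs);
        out[n].len = ei_end - end;
        n++;
    }
    return n;
}
```

Punching the middle of an extent leaves two pieces, punching one end leaves a single shortened piece, and punching the whole range leaves nothing.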

    In order to see the benefit, I add a function for reading the time stamp
    counter as below:

    #include <stdint.h>

    /* Read the x86 time stamp counter (returned in EDX:EAX). */
    uint64_t rdtsc(void)
    {
            uint32_t lo, hi;
            __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
            return (uint64_t)hi << 32 | lo;
    }

    My test environment is: Ubuntu, Intel i7-3770, 16 GB memory, 256 GB Micron SSD.

    truncation path: update extent cache from truncate_data_blocks_range
    non-truncation path: update extent cache from other paths
    total: all update paths

    a) Removing a 128MB file which has one extent node mapping the whole range
    of the file:
    1. dd if=/dev/zero of=/mnt/f2fs/128M bs=1M count=128
    2. sync
    3. rm /mnt/f2fs/128M

    Before:
                           total     count  average
    truncation:          7651022     32768   233.49

    Patched:
                           total     count  average
    truncation:             3321        33   100.64

    b) fsstress:
    fsstress -d /mnt/f2fs -l 5 -n 100 -p 20
    Test times: 5 times.

    Before:
                           total     count  average
    truncation:        5812480.6   20911.6   277.95
    non-truncation:    7783845.6   13440.8   579.12
    total:            13596326.2   34352.4   395.79

    Patched:
                           total     count  average
    truncation:        1281283.0    3041.6   421.25
    non-truncation:    7355844.4   13662.8   538.38
    total:             8637127.4   16704.4   517.06

    1) For the updates in the truncation path:
    - we can see that updating in batches clearly reduces both the total tsc
    and the update count;
    - besides, a single batched update punches multiple extent nodes in a
    loop and so executes more operations, which is why our average tsc
    increases noticeably.
    2) For the updates in the non-truncation path:
    - there is a small improvement, because when we only need to update the
    head or tail of an extent node, the new interface updates the info in
    the extent node directly, rather than removing the original extent node
    and then inserting the updated one into the cache as a new node.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

22 Aug, 2015

6 commits

  • In __lookup_extent_tree_ret we do not try to find neighbor nodes if
    we find the target node. In this case, we lose the chance to
    merge the new mapping with an existing extent node later.

    So the extent cache of an inode becomes fragmented after overwriting an
    existing file. We can see the number of extent nodes increase sharply in
    the following test case:

    dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024

    Extent Cache:
    - Hit Count: L1-1:0 L1-2:0 L2:0
    - Hit Ratio: 0% (0 / 3072)
    - Inner Struct Count: tree: 1, node: 1

    dd if=/dev/zero of=/mnt/f2fs/4m bs=4K count=1024 conv=notrunc

    Extent Cache:
    - Hit Count: L1-1:2048 L1-2:0 L2:0
    - Hit Ratio: 33% (2048 / 6144)
    - Inner Struct Count: tree: 1, node: 961

    This patch fixes this by looking up the neighbors of the target node for
    further merging.
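
What "merging with a neighbor" means can be sketched as below. The `extent_info` layout and both function names are assumptions for illustration; the sketch merges with at most one neighbor and does not model tree manipulation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative extent: file offset, start block, length in blocks. */
struct extent_info { unsigned int fofs, blk, len; };

/* "back" can absorb "front" if both file offsets and blocks are contiguous. */
static bool extent_mergeable(const struct extent_info *back,
                             const struct extent_info *front)
{
    return back->fofs + back->len == front->fofs &&
           back->blk + back->len == front->blk;
}

/*
 * Try to merge the new mapping *ei into its neighbors.
 * prev/next may be NULL when there is no neighbor on that side.
 * Returns true if ei was absorbed, false if it needs a node of its own.
 */
static bool try_merge(struct extent_info *prev, struct extent_info *next,
                      const struct extent_info *ei)
{
    assert(ei != NULL);
    if (prev && extent_mergeable(prev, ei)) {   /* back merge: extend prev */
        prev->len += ei->len;
        return true;
    }
    if (next && extent_mergeable(ei, next)) {   /* front merge: extend next */
        next->fofs = ei->fofs;
        next->blk = ei->blk;
        next->len += ei->len;
        return true;
    }
    return false;
}
```

Without the neighbor pointers, `try_merge` has nothing to merge into and every overwritten block ends up as its own node, which matches the node count growth shown in the test case.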

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch splits __insert_extent_tree_ret into __try_merge_extent_node &
    __insert_extent_tree for code readability.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • After commit 0f825ee6e873 ("f2fs: add new interfaces for extent tree"),
    f2fs_init_extent_tree becomes the only caller of __insert_extent_tree, and
    in f2fs_init_extent_tree, we will only insert extent node in an empty tree,
    so __try_{back,front}_merge in __insert_extent_tree will never be called.

    This patch removes this dead code and renames __insert_extent_tree
    to __init_extent_tree for readability.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch replaces the total hit stat with an rbtree hit stat,
    and then adjusts how the extent cache stats are shown:

    Hit Count:
    L1-1: for largest node hit count;
    L1-2: for last cached node hit count;
    L2: for extent node hit after looking up in rbtree.

    Hit Ratio:
    ratio (hit count / total lookup count)

    Inner Struct Count:
    tree count, node count.

    Before:
    Extent Hit Ratio: 0 / 2

    Extent Tree Count: 3

    Extent Node Count: 2

    Patched:
    Extent Cache:
    - Hit Count: L1-1:4871 L1-2:2074 L2:208
    - Hit Ratio: 1% (7153 / 550751)
    - Inner Struct Count: tree: 26560, node: 11824
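
The relation between the three hit counters and the ratio line can be reproduced with a short sketch. The struct and function names are assumptions, and integer truncation of the percentage is assumed because it matches the sample output (7153 / 550751 shown as 1%).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative counters matching the stat layout described above. */
struct extent_stat {
    unsigned long long hit_largest;  /* L1-1 */
    unsigned long long hit_cached;   /* L1-2 */
    unsigned long long hit_rbtree;   /* L2 */
    unsigned long long total;        /* total lookup count */
};

/* Sum of all hit counters: the numerator of the ratio. */
static unsigned long long total_hits(const struct extent_stat *s)
{
    assert(s != NULL);
    return s->hit_largest + s->hit_cached + s->hit_rbtree;
}

/* Hit ratio as a truncated integer percentage; 0 when nothing was looked up. */
static unsigned int hit_ratio_percent(const struct extent_stat *s)
{
    if (s->total == 0)
        return 0;
    return (unsigned int)(total_hits(s) * 100 / s->total);
}
```

With the sample values, 4871 + 2074 + 208 gives the 7153 hits shown in the ratio line.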

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch adds stats for the hit count of the largest/cached node, for
    showing in debugfs.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • In f2fs_lookup_extent_tree, et->cached_en was read and updated with only
    the read lock held. This could cause __lookup_extent_tree to return an
    entirely wrong extent_node if another thread updates et->cached_en just
    before __lookup_extent_tree returns.

    However, there are two things about this patch that need to be noted:
    1. It does no good to arrange the order of concurrent reads/writes; the
    result would still be random in such a case.
    2. It is built on this assumption: mixing reads and writes on a single
    pointer would not make the pointer partially wrong at any time. Please let
    me know if I'm wrong, thx.

    Signed-off-by: Fan li
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Fan Li
     

05 Aug, 2015

3 commits

  • Add a lookup and an insertion interface for extent tree.
    When no match is found, the new lookup returns the insert position and
    the prev/next extents closest to the offset we looked up.
    The new insertion uses these parameters to improve performance.

    There are three possible insertions after the lookup in
    f2fs_update_extent_tree. Two of them insert parts of the removed extent
    back into the tree; since no merge happens during this process, the new
    insertion skips the merge check in this scenario. The other insertion
    inserts a new extent into the tree; the new insertion uses the prev/next
    extents and the insert position to insert this extent directly, saving
    the time of searching down the tree.

    As long as the tree remains unchanged between lookup and insertion, this
    works fine. The new lookup will also be useful when adding
    multi-block extent support to the insertion interface.
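
The idea of a lookup that also reports where to insert can be sketched with a sorted array standing in for the rb-tree. This substitution, the struct layouts and the name `lookup_extent` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative extent: file offset, start block, length in blocks. */
struct extent_info { unsigned int fofs, blk, len; };

struct lookup_result {
    const struct extent_info *hit;   /* extent containing fofs, or NULL */
    const struct extent_info *prev;  /* closest extent before fofs, or NULL */
    const struct extent_info *next;  /* closest extent after fofs, or NULL */
    size_t insert_pos;               /* index where a new extent would go */
};

/* Look up fofs in an array of non-overlapping extents sorted by fofs. */
static struct lookup_result lookup_extent(const struct extent_info *tree,
                                          size_t n, unsigned int fofs)
{
    struct lookup_result r = { NULL, NULL, NULL, 0 };
    size_t lo = 0, hi = n;

    assert(tree != NULL || n == 0);

    /* Binary search for the first extent that ends after fofs. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tree[mid].fofs + tree[mid].len <= fofs)
            lo = mid + 1;
        else
            hi = mid;
    }

    r.insert_pos = lo;
    if (lo < n && tree[lo].fofs <= fofs) {   /* fofs lies inside tree[lo] */
        r.hit = &tree[lo];
        return r;
    }
    /* No match: report the neighbors so the caller can insert directly. */
    r.prev = lo > 0 ? &tree[lo - 1] : NULL;
    r.next = lo < n ? &tree[lo] : NULL;
    return r;
}
```

A caller that gets `prev`, `next` and `insert_pos` back from a miss can place the new extent without a second search, provided the structure has not changed in between.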

    Signed-off-by: Fan li
    Signed-off-by: Jaegeuk Kim

    Fan Li
     
  • Variables for recording extent cache ratio info were updated without
    protection. This patch changes them to atomic_t type for more
    accurate stats.
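
A userspace analogue of this change can be written with C11 atomics. This is only a sketch: `atomic_int` stands in for the kernel's `atomic_t`, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Counters that may be bumped from several threads at once. */
struct extent_stats {
    atomic_int total_ext;   /* total lookups */
    atomic_int hit_ext;     /* lookups that hit the cache */
};

static void stats_init(struct extent_stats *s)
{
    atomic_init(&s->total_ext, 0);
    atomic_init(&s->hit_ext, 0);
}

/*
 * Record one lookup. Each increment is a single atomic read-modify-write,
 * so concurrent callers cannot lose updates the way "counter++" on a
 * plain int can.
 */
static void record_lookup(struct extent_stats *s, int hit)
{
    assert(s != NULL);
    atomic_fetch_add_explicit(&s->total_ext, 1, memory_order_relaxed);
    if (hit)
        atomic_fetch_add_explicit(&s->hit_ext, 1, memory_order_relaxed);
}
```

Relaxed ordering is used here on the assumption that the counters are pure statistics and do not order any other memory accesses.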

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This patch moves extent cache related code from data.c into extent_cache.c.
    Since the extent cache is an independent feature and its code is not related
    to the rest of data.c, it is better to maintain it in a separate place.

    There is no functional change, only several small coding style fixes,
    including:
    * rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
    * rename misspelled word 'untill' to 'until';
    * remove unneeded 'return' in the end of f2fs_destroy_extent_tree().

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu