04 Sep, 2015

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "The major work includes fixing and enhancing the existing extent_cache
    feature, which has been well settling down so far and now it becomes a
    default mount option accordingly.

    Also, this version newly registers a f2fs memory shrinker to reclaim
    several objects consumed by a couple of data structures in order to
    avoid memory pressures.

    Another new feature is to add ioctl(F2FS_GARBAGE_COLLECT) which
    triggers a cleaning job explicitly by users.

    Most of the other patches are to fix bugs occurred in the corner cases
    across the whole code area"

    * tag 'for-f2fs-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (85 commits)
    f2fs: upset segment_info repair
    f2fs: avoid accessing NULL pointer in f2fs_drop_largest_extent
    f2fs: update extent tree in batches
    f2fs: fix to release inode correctly
    f2fs: handle f2fs_truncate error correctly
    f2fs: avoid unneeded initializing when converting inline dentry
    f2fs: atomically set inode->i_flags
    f2fs: fix wrong pointer access during try_to_free_nids
    f2fs: use __GFP_NOFAIL to avoid infinite loop
    f2fs: lookup neighbor extent nodes for merging later
    f2fs: split __insert_extent_tree_ret for readability
    f2fs: kill dead code in __insert_extent_tree
    f2fs: adjust showing of extent cache stat
    f2fs: add largest/cached stat in extent cache
    f2fs: fix incorrect mapping for bmap
    f2fs: add annotation for space utilization of regular/inline dentry
    f2fs: fix to update cached_en of extent tree properly
    f2fs: fix typo
    f2fs: check the node block address of newly allocated nid
    f2fs: go out for insert_inode_locked failure
    ...

    Linus Torvalds
     

03 Sep, 2015

1 commit

  • Pull core block updates from Jens Axboe:
    "This first core part of the block IO changes contains:

    - Cleanup of the bio IO error signaling from Christoph. We used to
    rely on the uptodate bit and passing around of an error, now we
    store the error in the bio itself.

    - Improvement of the above from myself, by shrinking the bio size
    down again to fit in two cachelines on x86-64.

    - Revert of the max_hw_sectors cap removal from a revision again,
    from Jeff Moyer. This caused performance regressions in various
    tests. Reinstate the limit, bump it to a more reasonable size
    instead.

    - Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
    Most devices have huge trim limits, which can cause nasty latencies
    when deleting files. Enable the admin to configure the size down.
    We will look into having a more sane default instead of UINT_MAX
    sectors.

    - Improvement of the SG gaps logic from Keith Busch.

    - Enable the block core to handle arbitrarily sized bios, which
    enables a nice simplification of bio_add_page() (which is an IO hot
    path). From Kent.

    - Improvements to the partition io stats accounting, making it
    faster. From Ming Lei.

    - Also from Ming Lei, a basic fixup for overflow of the sysfs pending
    file in blk-mq, as well as a fix for a blk-mq timeout race
    condition.

    - Ming Lin has been carrying Kent's above-mentioned patches forward
    for a while, and testing them. Ming also did a few fixes around
    that.

    - Sasha Levin found and fixed a use-after-free problem introduced by
    the bio->bi_error changes from Christoph.

    - Small blk cgroup cleanup from Viresh Kumar"

    * 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
    blk: Fix bio_io_vec index when checking bvec gaps
    block: Replace SG_GAPS with new queue limits mask
    block: bump BLK_DEF_MAX_SECTORS to 2560
    Revert "block: remove artifical max_hw_sectors cap"
    blk-mq: fix race between timeout and freeing request
    blk-mq: fix buffer overflow when reading sysfs file of 'pending'
    Documentation: update notes in biovecs about arbitrarily sized bios
    block: remove bio_get_nr_vecs()
    fs: use helper bio_add_page() instead of open coding on bi_io_vec
    block: kill merge_bvec_fn() completely
    md/raid5: get rid of bio_fits_rdev()
    md/raid5: split bio for chunk_aligned_read
    block: remove split code in blkdev_issue_{discard,write_same}
    btrfs: remove bio splitting and merge_bvec_fn() calls
    bcache: remove driver private bio splitting code
    block: simplify bio_add_page()
    block: make generic_make_request handle arbitrarily sized bios
    blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
    block: don't access bio->bi_error after bio_put()
    block: shrink struct bio down to 2 cache lines again
    ...

    Linus Torvalds
     

22 Aug, 2015

1 commit

  • The test step is like below:
    1. touch file
    2. truncate -s $((1024*1024)) file
    3. fallocate -o 0 -l $((1024*1024)) file
    4. fibmap.f2fs file

    The fibmap.f2fs result shown below is not correct:

    file_pos start_blk end_blk blks
    0 -937166132 -937166132 1
    4096 -937166132 -937166132 1
    8192 -937166132 -937166132 1
    12288 -937166132 -937166132 1
    16384 -937166132 -937166132 1
    20480 -937166132 -937166132 1
    ...
    1040384 -937166132 -937166132 1
    1044480 -937166132 -937166132 1

    This is because f2fs_map_blocks returns without error when it meets
    a hole or a preallocated block, and its caller __get_data_block then
    copies the value of an uninitialized variable into bh->b_blocknr.

    Unfortunately, generic_block_bmap checks neither the return value of
    get_data() nor the mapping info of the buffer_head, so it ends up
    returning a random block address.
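
    The failure mode described above can be modelled in a few lines. This is
    only a toy sketch, not the kernel code: BufferHead, map_blocks and the two
    get_block variants are hypothetical names, and GARBAGE stands in for the
    uninitialized value (it reuses the number from the output above).

```python
from dataclasses import dataclass

# Stand-in for an uninitialized stack value (number taken from the output above).
GARBAGE = -937166132


@dataclass
class BufferHead:
    b_blocknr: int = 0
    mapped: bool = False


def map_blocks(block_table, lblk):
    """Return (status, blkaddr, mapped). Holes succeed with mapped=False."""
    addr = block_table.get(lblk)
    if addr is None:
        # Hole or preallocated block: no error, but nothing was mapped.
        return 0, GARBAGE, False
    return 0, addr, True


def get_block_buggy(block_table, lblk, bh):
    status, addr, _mapped = map_blocks(block_table, lblk)
    if status == 0:
        bh.b_blocknr = addr  # copies garbage for holes
        bh.mapped = True
    return status


def get_block_checked(block_table, lblk, bh):
    status, addr, mapped = map_blocks(block_table, lblk)
    if status == 0 and mapped:  # only publish an address that was really mapped
        bh.b_blocknr = addr
        bh.mapped = True
    return status


def bmap(get_block, block_table, lblk):
    """Trusts whatever get_block left in the buffer head, like the generic helper."""
    bh = BufferHead()
    get_block(block_table, lblk, bh)
    return bh.b_blocknr
```

    With an empty block table, bmap(get_block_buggy, {}, 0) returns the garbage
    value, while bmap(get_block_checked, {}, 0) returns 0.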

    After fixing the issue, the result is correct:

    file_pos start_blk end_blk blks
    0 0 0 256

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

21 Aug, 2015

1 commit

    As the comment on bio_alloc_bioset quoted below explains, allocation is only
    guaranteed for callers that hold one bio at a time. f2fs can allocate multiple
    bios at the same time, so we can't guarantee that a bio is always allocated.

    "
    * When @bs is not NULL, if %__GFP_WAIT is set then bio_alloc will always be
    * able to allocate a bio. This is due to the mempool guarantees. To make this
    * work, callers must never allocate more than 1 bio at a time from this pool.
    * Callers that need to allocate more than 1 bio must always submit the
    * previously allocated bio for IO before attempting to allocate a new one.
    * Failure to do so can cause deadlocks under memory pressure.
    "
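
    The rule in the quoted comment can be illustrated with a toy pool. This is
    a sketch under stated assumptions, not the mempool implementation: BioPool
    and both helper functions are hypothetical, and a None return stands for
    "the caller would sleep until a bio is returned to the pool".

```python
class BioPool:
    """Toy mempool: `reserve` preallocated bios back the allocation guarantee."""

    def __init__(self, reserve=1):
        self.free = reserve

    def alloc(self):
        # Under memory pressure only the reserve is available. Returning
        # None here stands for "the caller would block until a bio is freed".
        if self.free == 0:
            return None
        self.free -= 1
        return object()

    def complete(self, bio):
        # I/O completion returns the bio to the reserve.
        self.free += 1


def alloc_two_without_submitting(pool):
    first = pool.alloc()
    second = pool.alloc()  # reserve is empty and nothing in flight can refill it
    return first, second


def alloc_submit_alloc(pool):
    first = pool.alloc()
    pool.complete(first)  # submitted and completed before the next allocation
    second = pool.alloc()
    return first, second
```

    Holding the first bio while asking for a second is the pattern that can
    deadlock; submitting in between keeps the guarantee intact.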

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

14 Aug, 2015

1 commit

  • We can always fill up the bio now, no need to estimate the possible
    size based on queue parameters.

    Acked-by: Steven Whitehouse
    Signed-off-by: Kent Overstreet
    [hch: rebased and wrote a changelog]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ming Lin
    Signed-off-by: Jens Axboe

    Kent Overstreet
     

12 Aug, 2015

2 commits

    Previously, we used a radix tree to index all registered page entries of an
    atomic file. Now we only use the radix tree to see whether the current page
    is indexed or not, since its other user is gone as of commit
    042b7816aaeb ("f2fs: remove unnecessary call to invalidate inmemory pages").

    So this patch uses a more efficient way: introduce a macro,
    ATOMIC_WRITTEN_PAGE, and set it as the page private value to indicate
    the page's indexing status. This saves memory and lookup time.
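
    A minimal sketch of the idea, assuming a marker stored on the page itself
    replaces the separate index. The Page class, register_inmem_page and the
    marker value are illustrative choices, not taken from the patch.

```python
ATOMIC_WRITTEN_PAGE = 0x0000FFFF  # marker value chosen for illustration


class Page:
    def __init__(self, index):
        self.index = index
        self.private = 0  # models the page private value


def register_inmem_page(inmem_pages, page):
    """Queue a page for an atomic write exactly once."""
    if page.private == ATOMIC_WRITTEN_PAGE:
        return False  # already registered: O(1) check on the page itself
    page.private = ATOMIC_WRITTEN_PAGE
    inmem_pages.append(page)
    return True
```

    No lookup structure is consulted, so the check costs neither extra memory
    per page nor a tree walk.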

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    We ran the LTP testcases on f2fs and got a TFAIL in diotest4. The detailed
    result is as follows:

    dio04

    <<<test_start>>>
    tag=dio04 stime=1432278894
    cmdline="diotest4"
    contacts=""
    analysis=exit
    <<<test_output>>>
    diotest4 1 TPASS : Negative Offset
    diotest4 2 TPASS : removed
    diotest4 3 TFAIL : diotest4.c:129: write allows odd count.returns 1: Success
    diotest4 4 TFAIL : diotest4.c:183: Odd count of read and write
    diotest4 5 TPASS : Read beyond the file size
    ......

    The result of ext4 in the same environment:

    dio04

    <<<test_start>>>
    tag=dio04 stime=1432259643
    cmdline="diotest4"
    contacts=""
    analysis=exit
    <<<test_output>>>
    diotest4 1 TPASS : Negative Offset
    diotest4 2 TPASS : removed
    diotest4 3 TPASS : Odd count of read and write
    diotest4 4 TPASS : Read beyond the file size
    ......

    The reason is that when DIO is triggered in f2fs, ->direct_IO returns zero
    if the writer's buffer offset, file offset, or transfer size is not
    aligned to the filesystem block size, so the write falls back to
    a buffered write instead of failing with -EINVAL.

    This patch fixes the problem by returning the correct error number in that
    case, and by removing the condition in check_direct_IO so that
    the verification is enabled for direct readers too.

    Besides, Jaegeuk Kim pointed out that there are exceptional cases in which
    direct I/O should always fall back to a buffered write, such as DIO on an
    encrypted file.
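
    The behaviour described above can be sketched as follows. The function
    names mirror the ones in the text, but the signatures, the segment
    representation and the 4096-byte default are assumptions made for this
    example, and the block size is assumed to be a power of two.

```python
import errno


def check_direct_io(file_offset, segments, block_size=4096):
    """Return 0 if the request is block aligned, -EINVAL otherwise.

    `segments` is a list of (buffer_address, length) pairs.
    """
    mask = block_size - 1  # valid because block_size is a power of two
    if file_offset & mask:
        return -errno.EINVAL
    for addr, length in segments:
        if addr & mask or length & mask:
            return -errno.EINVAL
    return 0


def direct_io(file_offset, segments, encrypted=False, block_size=4096):
    """Return 0 to fall back to buffered I/O, <0 on error, else bytes moved."""
    if encrypted:
        return 0  # exceptional case: always fall back to buffered I/O
    err = check_direct_io(file_offset, segments, block_size)
    if err:
        return err  # misaligned requests fail instead of silently falling back
    return sum(length for _addr, length in segments)
```

    An odd count such as direct_io(0, [(0x1000, 1)]) now yields -EINVAL, which
    is the outcome diotest4 expects.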

    Signed-off-by: Yunlei He
    [Chao Yu: made a small change and added a detailed description to the commit message]
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

06 Aug, 2015

1 commit


05 Aug, 2015

17 commits

    In the following call path, we pass a locked and referenced ipage
    pointer to get_new_data_page:
    - init_inode_metadata
    - make_empty_dir
    - get_new_data_page

    There are two exit paths in get_new_data_page when an error occurs:
    1) grab_cache_page fails, ipage will not be released;
    2) f2fs_reserve_block fails, ipage will be released in callee.

    So error handling in get_new_data_page is not consistent.

    For f2fs_reserve_block, it's not very easy to change the rule
    of error handling, since it's already complicated.

    Here we decide to choose an easy way to fix this issue:
    if any error occurs in get_new_data_page, we ensure that ipage is
    released in this function.

    The same issue exists in f2fs_convert_inline_dir; fix that too.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    Some backing devices need pages to be stable during writeback. It doesn't
    matter whether the page is completely overwritten or already uptodate; it
    needs to wait before the write.

    Signed-off-by: Fan li
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Fan Li
     
    When flushing comes from the background and there is no dirty page in the
    inode's mapping, we had better skip searching the mapping for dirty pages
    to write back.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    The two if statements that execute "goto continue_unlock" differ only in
    their conditions, which test whether "step" and "is_cold_data(page)" are
    both 0 or both 1. In other words, the jump is taken exactly when "step"
    equals "is_cold_data(page)", so the two statements can be merged into
    one to remove the duplicated code.
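
    The equivalence claimed above can be checked exhaustively. The two-branch
    form below is reconstructed from the description rather than copied from
    the patch, and both function names are hypothetical.

```python
from itertools import product


def skip_duplicated(step, is_cold):
    """Two separate tests, each leading to the same jump."""
    if step == 0 and is_cold == 0:
        return True
    if step == 1 and is_cold == 1:
        return True
    return False


def skip_merged(step, is_cold):
    """Single comparison that replaces both tests."""
    return step == is_cold


# Compare the two forms over every combination of 0 and 1.
EQUIVALENT = all(
    skip_duplicated(step, cold) == skip_merged(step, cold)
    for step, cold in product((0, 1), repeat=2)
)
```

    EQUIVALENT holds only because both values are restricted to 0 and 1.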

    Signed-off-by: Tiezhu Yang
    Signed-off-by: Jaegeuk Kim

    Tiezhu Yang
     
    This patch changes the code so that the caller handles the page after its bio gets an error.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    If there are gced dirty pages and normal dirty pages in the mapping
    of one inode, we might write them back alternately with discontinuous
    block addresses, resulting in low performance.

    This patch introduces f2fs_write_cache_pages with code copied from
    write_cache_pages in mm/page-writeback.c.

    In this function, we refactor the flow into two steps:
    1) write back all cold pages.
    2) write back all non-cold pages.

    With this method, f2fs writes back dirty pages of the same
    temperature in batches. The written blocks then get more
    contiguous addresses, so they can be merged as much as possible
    in the f2fs bio cache, and the chance of small IOs being submitted
    from the block layer is reduced.
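
    The effect of the two steps can be sketched with a small model. It assumes
    4K pages (so the 512K test file has 128 pages) and that the pages moved by
    gc, the odd ones in the test below, are the cold ones. All names here are
    hypothetical.

```python
def single_pass(pages):
    """Write dirty pages in index order, whatever their temperature."""
    return sorted(pages)


def two_pass(pages):
    """Step 1: all cold pages. Step 2: all non-cold pages."""
    ordered = sorted(pages)
    cold = [page for page in ordered if page[1]]
    other = [page for page in ordered if not page[1]]
    return cold + other


def temperature_switches(order):
    """Count how often consecutive writes change temperature."""
    return sum(1 for a, b in zip(order, order[1:]) if a[1] != b[1])


# (index, is_cold) for a 512K file with 4K pages; odd indexes are cold.
PAGES = [(index, index % 2 == 1) for index in range(128)]
```

    Each switch sends the next write to a different area, which breaks up a
    mergeable run. The single pass switches on every page; the two-pass order
    switches once.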

    Test environment: 8g nokia sd card (a very old sd card, but it shows
    a better effect when testing with this patch; with a 32g kingston
    sd card, I didn't see much improvement).

    Test step:
    1. touch testfile;
    2. truncate -s 512K testfile;
    3. write all pages with odd index;
    4. trigger gc by ioctl;
    5. write all pages with even index;
    6. time fsync testfile.

    before:
    real 0m0.402s
    user 0m0.000s
    sys 0m0.000s

    after:
    real 0m0.143s
    user 0m0.004s
    sys 0m0.004s

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    This patch moves extent cache related code from data.c into extent_cache.c.
    Since the extent cache is an independent feature and its code is not related to
    the rest of data.c, it's better to maintain it in a separate place.

    There is no functionality change, but several small coding style fixes
    including:
    * rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
    * rename misspelled word 'untill' to 'until';
    * remove unneeded 'return' in the end of f2fs_destroy_extent_tree().

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Since only parts of extents longer than F2FS_MIN_EXTENT_LEN will
    be kept in extent cache after split, extents already shorter than
    F2FS_MIN_EXTENT_LEN don't need to try split at all.
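
    A short sketch of why the early exit is safe. The value 16 for
    F2FS_MIN_EXTENT_LEN, the ">=" comparison and the split_extent helper are
    assumptions made for illustration, and the removed block is assumed to lie
    inside the extent.

```python
F2FS_MIN_EXTENT_LEN = 16  # assumed value, for illustration only


def split_extent(start, length, hole):
    """Remove block `hole` from [start, start + length) and keep long parts.

    Returns a list of (start, length) pairs worth caching.
    """
    if length < F2FS_MIN_EXTENT_LEN:
        # Neither remaining part could reach the minimum, so skip the split.
        return []
    left = (start, hole - start)
    right = (hole + 1, start + length - hole - 1)
    return [part for part in (left, right) if part[1] >= F2FS_MIN_EXTENT_LEN]
```

    Both parts of a split are shorter than the original extent, so an extent
    below the minimum can never produce a part that is kept.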

    Signed-off-by: Fan Li
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Fan Li
     
    This patch fixes updating page flags (e.g. the Uptodate/cold flags) in
    ->write_begin.

    Otherwise, the page will be non-uptodate when we try to write an entire
    page, and the cold data flag in the page will not be cleared when a gced page
    is being rewritten.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    If an extent_tree entry has a zero reference count, we can drop it from the
    cache with higher priority than entries that are currently referenced.
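
    A simplified model of that ordering, assuming the shrinker is asked to
    free a fixed number of entries. shrink_extent_trees and the dict of
    reference counts are hypothetical, and the real shrinker works on
    different data structures.

```python
def shrink_extent_trees(trees, nr_to_free):
    """Pick entries to drop, unreferenced ones first.

    `trees` maps an entry name to its reference count.
    """
    unreferenced = [name for name, ref in trees.items() if ref == 0]
    referenced = [name for name, ref in trees.items() if ref > 0]
    return (unreferenced + referenced)[:nr_to_free]
```

    Referenced entries are only touched once every zero-reference entry has
    been selected.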

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    In ->writepages, we use the writepages mutex lock to serialize all block
    address allocation and page submission pairs from different inodes.
    This makes the delayed dirty pages of one inode get written as
    contiguously as possible.

    But there is one problem: we did not submit the current cached bio inside
    the region protected by the writepages mutex lock, so there is a small chance
    that we submit another thread's bio as shown below, resulting in
    more split bios.

    thread 1                        thread 2
    ->writepages
      lock(writepages)
      ->write_cache_pages
      unlock(writepages)
                                    lock(writepages)
                                    ->write_cache_pages
    ->f2fs_submit_merged_bio
                                    ->writepage
                                    unlock(writepages)

    fs_mark-6535 [002] .... 2242.270230: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5766152, size = 524288
    fs_mark-6536 [000] .... 2242.270361: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767176, size = 4096
    fs_mark-6536 [000] .... 2242.270370: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, NODE, sector = 8138112, size = 4096
    fs_mark-6535 [002] .... 2242.270776: f2fs_submit_write_bio: dev = (1,0), WRITE_SYNC, DATA, sector = 5767184, size = 516096

    This may really increase the time spent in the block layer, and may cause
    larger IO latency.

    This patch moves the submit operation into the region protected by the
    writepages mutex lock to avoid bio splits when concurrent writeback is
    intensive.

    My test environment: virtual machine,
    intel cpu i5 2500, 8GB size memory, 4GB size ramdisk

    time fs_mark -t 16 -L 1 -s 524288 -S 1 -d /mnt/f2fs/

    before:
    real 0m4.244s
    user 0m0.088s
    sys 0m12.336s

    after:
    real 0m3.822s
    user 0m0.072s
    sys 0m10.760s

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
    Because of the extent shrinker or other -ENOMEM scenarios, there is no guarantee
    that the largest extent is cached in the tree all the time.

    Instead of relying on extent_tree, we can simply check the cached one in extent
    tree accordingly.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • We don't need to handle the duplicate extent information.

    The integrated rule is:
    - update on-disk extent with largest one tracked by in-memory extent_cache
    - destroy extent_tree for the truncation case
    - drop per-inode extent_cache by shrinker

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch registers shrinking extent_caches.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    This patch relocates the cached_en update so that it is covered by the spin_lock
    and is set only once, after the checks have completed.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    Previously, f2fs_update_extent_cache() updated the in-memory extent_cache all the
    time, and only preserved its up-to-date extent into the on-disk one during
    f2fs_evict_inode.

    But, in the following scenario:

    1. mount
    2. open & write an extent X
    3. f2fs_evict_inode; on-disk extent is X
    4. open & update the extent X with Y
    5. sync; trigger checkpoint
    6. power-cut

    After power-on, f2fs should serve extent Y, but we have an on-disk extent X.

    This causes a failure on xfstests/311.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch fixes wrong calculation on block address field when an extent is
    split.

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

29 Jul, 2015

1 commit

  • Currently we have two different ways to signal an I/O error on a BIO:

    (1) by clearing the BIO_UPTODATE flag
    (2) by returning a Linux errno value to the bi_end_io callback

    The first one has the drawback of only communicating a single possible
    error (-EIO), and the second one has the drawback of not being persistent
    when bios are queued up, and of not being passed along from child to parent
    bio in the ever more popular chaining scenario. Having both mechanisms
    available has the additional drawback of utterly confusing driver authors
    and introducing bugs where various I/O submitters only deal with one of
    them, and the others have to add boilerplate code to deal with both kinds
    of error returns.

    So add a new bi_error field to store an errno value directly in struct
    bio and remove the existing mechanisms to clean all this up.
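
    The following toy model shows why keeping the error in the bio helps with
    chaining. It is not the kernel implementation: the Bio class, bio_endio
    and complete are simplified stand-ins, each parent has a single child,
    and the "keep the first error" policy is an assumption of this sketch.

```python
import errno


class Bio:
    def __init__(self, end_io=None, parent=None):
        self.bi_error = 0  # errno value stored in the bio itself
        self.bi_end_io = end_io
        self.parent = parent


def bio_endio(bio):
    """Complete a bio; a chained child hands its error to the parent."""
    if bio.parent is not None:
        if bio.bi_error and not bio.parent.bi_error:
            bio.parent.bi_error = bio.bi_error
        bio_endio(bio.parent)
    elif bio.bi_end_io is not None:
        bio.bi_end_io(bio)  # the callback reads bio.bi_error directly


def complete(bio, error=0):
    """Record an optional error on the bio, then complete it."""
    if error:
        bio.bi_error = error
    bio_endio(bio)
```

    The completion callback takes only the bio, and an error such as -ENOSPC
    set on a child is still visible when the parent's callback runs.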

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: NeilBrown
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

25 Jul, 2015

1 commit

    The cgroup attaches inode->i_wb via mark_inode_dirty, and when set_page_writeback
    is called, __inc_wb_stat() updates i_wb's stat.

    So, we need to explicitly call set_page_dirty->__mark_inode_dirty prior to
    writing back any pages.

    This patch should resolve the following kernel panic reported by Andreas Reis.

    https://bugzilla.kernel.org/show_bug.cgi?id=101801

    --- Comment #2 from Andreas Reis ---
    BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
    IP: [] __percpu_counter_add+0x1a/0x90
    PGD 2951ff067 PUD 2df43f067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 7 PID: 10356 Comm: gcc Tainted: G W 4.2.0-1-cu #1
    Hardware name: Gigabyte Technology Co., Ltd. G1.Sniper M5/G1.Sniper M5, BIOS
    T01 02/03/2015
    task: ffff880295044f80 ti: ffff880295140000 task.ti: ffff880295140000
    RIP: 0010:[] []
    __percpu_counter_add+0x1a/0x90
    RSP: 0018:ffff880295143ac8 EFLAGS: 00010082
    RAX: 0000000000000003 RBX: ffffea000a526d40 RCX: 0000000000000001
    RDX: 0000000000000020 RSI: 0000000000000001 RDI: 0000000000000088
    RBP: ffff880295143ae8 R08: 0000000000000000 R09: ffff88008f69bb30
    R10: 00000000fffffffa R11: 0000000000000000 R12: 0000000000000088
    R13: 0000000000000001 R14: ffff88041d099000 R15: ffff880084a205d0
    FS: 00007f8549374700(0000) GS:ffff88042f3c0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000000a8 CR3: 000000033e1d5000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Stack:
    0000000000000000 ffffea000a526d40 ffff880084a20738 ffff880084a20750
    ffff880295143b48 ffffffff811cc91e ffff880000000000 0000000000000296
    0000000000000000 ffff880417090198 0000000000000000 ffffea000a526d40
    Call Trace:
    [] __test_set_page_writeback+0xde/0x1d0
    [] do_write_data_page+0xe7/0x3a0
    [] gc_data_segment+0x5aa/0x640
    [] do_garbage_collect+0x138/0x150
    [] f2fs_gc+0x1be/0x3e0
    [] f2fs_balance_fs+0x81/0x90
    [] f2fs_unlink+0x47/0x1d0
    [] vfs_unlink+0x109/0x1b0
    [] do_unlinkat+0x287/0x2c0
    [] SyS_unlink+0x16/0x20
    [] entry_SYSCALL_64_fastpath+0x12/0x71
    Code: 41 5e 5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 49
    89 f5 41 54 49 89 fc 53 48 83 ec 08 65 ff 05 e6 d9 b6 7e 8b 47 20 48 63 ca
    65 8b 18 48 63 db 48 01 f3 48 39 cb 7d 0a
    RIP [] __percpu_counter_add+0x1a/0x90
    RSP
    CR2: 00000000000000a8
    ---[ end trace 5132449a58ed93a3 ]---
    note: gcc[10356] exited with preempt_count 2

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

02 Jun, 2015

1 commit


29 May, 2015

8 commits

  • This patch adds encryption support in read and write paths.

    Note that, in f2fs, we need to consider the cleaning operation.
    In the cleaning procedure, we must avoid encrypting and decrypting written blocks.
    So, this patch implements move_encrypted_block().

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch activates the following APIs for encryption support.

    The rules quoted by ext4 are:
    - An unencrypted directory may contain encrypted or unencrypted files
    or directories.
    - All files or directories in a directory must be protected using the
    same key as their containing directory.
    - Encrypted inode for regular file should not have inline_data.
    - Encrypted symlink and directory may have inline_data and inline_dentry.

    This patch activates the following APIs.
    1. f2fs_link : validate context
    2. f2fs_lookup : ''
    3. f2fs_rename : ''
    4. f2fs_create/f2fs_mkdir : inherit its dir's context
    5. f2fs_direct_IO : do buffered io for regular files
    6. f2fs_open : check encryption info
    7. f2fs_file_mmap : ''
    8. f2fs_setattr : ''
    9. f2fs_file_write_iter : '' (Called by sys_io_submit)
    10. f2fs_fallocate : do not support fcollapse
    11. f2fs_evict_inode : free_encryption_info

    Signed-off-by: Michael Halcrow
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch slightly changes f2fs_fiemap function to report unwritten area.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch splits find_data_page as follows.

    1. f2fs_gc
    - use get_read_data_page() with read only

    2. find_in_level
    - use find_data_page without locked page

    3. truncate_partial_page
    - In cache_only mode, just drop the cached page.
    - Otherwise, use get_lock_data_page() and guarantee to truncate

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    There are two threads:
    f2fs_delete_entry()              get_new_data_page()
                                     f2fs_reserve_block()
                                     dn.blkaddr = XXX
    lock_page(dentry_block)
    truncate_hole()
    dn.blkaddr = NULL
    unlock_page(dentry_block)
                                     lock_page(dentry_block)
                                     fill the block from XXX address
                                     add new dentries
                                     unlock_page(dentry_block)

    Later, f2fs_write_data_page() will truncate the dentry_block, since
    its block address is NULL.

    This was caused by the wrong lock order.
    In this case, we should do f2fs_reserve_block() after locking its dentry block.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch adds f2fs_sb_info and page pointers in f2fs_io_info structure.
    With this change, we can reduce a lot of parameters for IO functions.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
  • This patch implements f2fs_mpage_readpages for further optimization on
    encryption support.

    The basic code was taken from fs/mpage.c, and simplified by relying on
    block_size being equal to page_size in f2fs.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     
    This patch introduces the f2fs_map_blocks structure, like ext4_map_blocks.
    Now, f2fs uses f2fs_map_blocks when handling get_block.

    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

05 May, 2015

1 commit

    Yuanhan Liu reported a performance regression caused by this commit.
    The basic idea was to reduce a single-point mutex, but it turns out that this
    causes other contention, such as context switches.

    https://lkml.org/lkml/2015/4/21/11

    Until finishing the analysis on this issue, I'd like to revert this for a while.

    This reverts commit 78373b7319abdf15050af5b1632c4c8b8b398f33.

    Jaegeuk Kim
     

18 Apr, 2015

1 commit

  • Pull f2fs updates from Jaegeuk Kim:
    "New features:
    - in-memory extent_cache
    - fs_shutdown to test power-off-recovery
    - use inline_data to store symlink path
    - show f2fs as a non-misc filesystem

    Major fixes:
    - avoid CPU stalls on sync_dirty_dir_inodes
    - fix some power-off-recovery procedure
    - fix handling of broken symlink correctly
    - fix missing dot and dotdot made by sudden power cuts
    - handle wrong data index during roll-forward recovery
    - preallocate data blocks for direct_io

    ... and a bunch of minor bug fixes and cleanups"

    * tag 'for-f2fs-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (71 commits)
    f2fs: pass checkpoint reason on roll-forward recovery
    f2fs: avoid abnormal behavior on broken symlink
    f2fs: flush symlink path to avoid broken symlink after POR
    f2fs: change 0 to false for bool type
    f2fs: do not recover wrong data index
    f2fs: do not increase link count during recovery
    f2fs: assign parent's i_mode for empty dir
    f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries
    f2fs: fix mismatching lock and unlock pages for roll-forward recovery
    f2fs: fix sparse warnings
    f2fs: limit b_size of mapped bh in f2fs_map_bh
    f2fs: persist system.advise into on-disk inode
    f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get
    f2fs: preallocate fallocated blocks for direct IO
    f2fs: enable inline data by default
    f2fs: preserve extent info for extent cache
    f2fs: initialize extent tree with on-disk extent info of inode
    f2fs: introduce __{find,grab}_extent_tree
    f2fs: split set_data_blkaddr from f2fs_update_extent_cache
    f2fs: enable fast symlink by utilizing inline data
    ...

    Linus Torvalds
     

12 Apr, 2015

2 commits