09 Jul, 2020

1 commit

  • Wire up f2fs to support inline encryption via the helper functions which
    fs/crypto/ now provides. This includes:

    - Adding a mount option 'inlinecrypt' which enables inline encryption
    on encrypted files where it can be used.

    - Setting the bio_crypt_ctx on bios that will be submitted to an
    inline-encrypted file.

    - Not adding logically discontiguous data to bios that will be submitted
    to an inline-encrypted file.

    - Not doing filesystem-layer crypto on inline-encrypted files.
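
    A minimal sketch of the second and third points, using the helpers
    fs/crypto/ now provides (the wrapper names are illustrative):

    static void f2fs_set_bio_crypt_ctx(struct bio *bio,
                                       const struct inode *inode,
                                       pgoff_t first_idx, gfp_t gfp_mask)
    {
            /* attach an inline-crypto context; a no-op for inodes that
             * don't use inline encryption */
            if (fscrypt_inode_uses_inline_crypto(inode))
                    fscrypt_set_bio_crypt_ctx(bio, inode, first_idx, gfp_mask);
    }

    static bool f2fs_crypt_mergeable_bio(struct bio *bio,
                                         const struct inode *inode,
                                         pgoff_t next_idx)
    {
            /* logically discontiguous data must not share a bio, since
             * the crypto data unit number must advance contiguously */
            return fscrypt_mergeable_bio(bio, inode, next_idx);
    }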

    This patch includes a fix by Sahitya Tummala for a race during in-place
    updates (IPU).

    Signed-off-by: Satya Tangirala
    Acked-by: Jaegeuk Kim
    Reviewed-by: Eric Biggers
    Reviewed-by: Chao Yu
    Link: https://lore.kernel.org/r/20200702015607.1215430-4-satyat@google.com
    Co-developed-by: Eric Biggers
    Signed-off-by: Eric Biggers

    Satya Tangirala
     

29 May, 2020

1 commit

  • During compressed data writeback, we need to drop dirty pages as we do
    for non-compressed pages if the checkpoint stops (cp stop). However,
    there is no need to compress any data in that case, so let's detect the
    cp stop condition in cluster_may_compress() to avoid redundant
    compression and let the following f2fs_write_raw_pages() drop the dirty
    pages correctly.
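
    A minimal sketch of the check (helper names other than f2fs_cp_error()
    follow the existing compression code):

    static bool cluster_may_compress(struct compress_ctx *cc)
    {
            if (!f2fs_compressed_file(cc->inode))
                    return false;
            /* cp has stopped: don't waste cycles compressing, since the
             * following f2fs_write_raw_pages() drops the dirty pages */
            if (unlikely(f2fs_cp_error(F2FS_I_SB(cc->inode))))
                    return false;
            return __cluster_may_compress(cc);
    }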

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

12 May, 2020

4 commits

  • During zstd compression, ZSTD_endStream() may return a non-zero value
    because the destination buffer is full while compressed data still
    remains in the intermediate buffer. This means the zstd algorithm could
    not save at least one block of space, so let's just write back the raw
    data instead of the compressed data; this fixes data corruption caused
    by decompressing incompletely stored compressed data.
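
    The flush handling then looks roughly like this (a sketch of the zstd
    compress_pages hook; surrounding code omitted):

            ret = ZSTD_endStream(stream, &outbuf);
            if (ZSTD_isError(ret))
                    return -EIO;
            /* a non-zero return means compressed data still remains in
             * the intermediate buffer, i.e. we cannot save even one
             * block; back off so the raw data is written instead */
            if (ret)
                    return -EAGAIN;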

    Fixes: 50cfa66f0de0 ("f2fs: compress: support zstd compress algorithm")
    Signed-off-by: Daeho Jeong
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Commonly, to handle the lz4 worst case, the caller should allocate a
    buffer of size LZ4_compressBound(inputsize) to store the compressed
    data. However, even if the caller doesn't allocate that much space, the
    lz4 compressor still handles the output buffer budget properly, and
    ends the compression when the space left in the output buffer is not
    enough.

    So we don't have to allocate a buffer sized for the worst case. This
    avoids a 2 * 4KB intermediate buffer allocation when log_cluster_size
    is 2, and avoids unnecessary compression work when we cannot save at
    least 4KB of space.
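
    A sketch of the lz4 call with the capped output budget (cc->clen is
    assumed to be preset to the cluster size minus one block and the
    header):

    static int lz4_compress_pages(struct compress_ctx *cc)
    {
            int len;

            /* the output budget is deliberately below the worst case;
             * LZ4_compress_default() returns 0 when output doesn't fit */
            len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata,
                                       cc->rlen, cc->clen, cc->private);
            if (!len)
                    return -EAGAIN; /* fall back to writing raw pages */

            cc->clen = len;
            return 0;
    }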

    Suggested-by: Daeho Jeong
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • The LZO-RLE extension (run-length encoding) was introduced to improve
    the performance of the LZO algorithm on data containing many zeros, and
    zram has switched to this extended algorithm by default. This patch
    adds support for the extension; to enable it, enable both the
    F2FS_FS_LZO and F2FS_FS_LZORLE configs and specify the
    "compress_algorithm=lzo-rle" mount option.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • If the compression feature is on and free memory is scarce, the page
    refault ratio is higher than before. The root cause is:
    - the {,de}compression flow needs to allocate intermediate pages to
    store the compressed data of a cluster, so during their allocation the
    VM may reclaim mmapped pages.
    - if the reclaimed pages belong to a compressed cluster, their refault
    may cause more intermediate page allocations, resulting in even more
    mmapped pages being reclaimed.

    So this patch introduces a mempool for intermediate page allocation in
    order to avoid the high refault ratio. By default, the number of
    preallocated pages in the pool is 512; the user can change it by
    setting the 'num_compress_pages' parameter at module initialization.
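
    A sketch of the pool, using the names from the description (setup
    details may differ from the actual code):

    static unsigned int num_compress_pages = 512;
    module_param(num_compress_pages, uint, 0444);
    MODULE_PARM_DESC(num_compress_pages,
                    "Number of intermediate compress pages to preallocate");

    static mempool_t *compress_page_pool;

    int f2fs_init_compress_mempool(void)
    {
            compress_page_pool = mempool_create_page_pool(num_compress_pages, 0);
            return compress_page_pool ? 0 : -ENOMEM;
    }

    struct page *f2fs_compress_alloc_page(void)
    {
            /* GFP_NOFS plus a preallocated pool: no fs-reclaim recursion,
             * and allocation pressure is bounded by the pool size */
            return mempool_alloc(compress_page_pool, GFP_NOFS);
    }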

    Ma Feng found sparse warnings in the original patch and fixed them as
    below:

    fs/f2fs/compress.c:501:5: warning: symbol 'num_compress_pages' was not declared. Should it be static?
    fs/f2fs/compress.c:530:6: warning: symbol 'f2fs_compress_free_page' was not declared. Should it be static?

    Reported-by: Hulk Robot
    Signed-off-by: Chao Yu
    Signed-off-by: Ma Feng
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

24 Apr, 2020

1 commit

  • f2fs_quota_sync() takes f2fs_lock_op() before flushing dirty pages, but
    f2fs_write_data_page() then returns EAGAIN. As with dentry blocks, we
    can just bypass taking the lock, since quota blocks are also maintained
    by checkpoint.
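
    A sketch of the bypass, assuming the IS_NOQUOTA() check joins the
    existing directory-inode special case:

            /* quota blocks, like dentry blocks, are kept consistent by
             * checkpoint, so skip f2fs_lock_op() for them */
            if (S_ISDIR(inode->i_mode) || IS_NOQUOTA(inode))
                    goto write;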

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Jaegeuk Kim
     

31 Mar, 2020

6 commits

  • Add the two callback interfaces below to struct f2fs_compress_ops:

    int (*init_decompress_ctx)(struct decompress_io_ctx *dic);
    void (*destroy_decompress_ctx)(struct decompress_io_ctx *dic);

    They will be used by the zstd compression algorithm later.

    In addition, this patch adds callback function pointer checks, so that
    a given algorithm can avoid defining functions it doesn't need.
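
    A guarded call site then looks roughly like this:

            /* the new callbacks are optional: algorithms that need no
             * extra context simply leave them NULL */
            if (cops->init_decompress_ctx) {
                    ret = cops->init_decompress_ctx(dic);
                    if (ret)
                            goto out_free;
            }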

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Otherwise, it will cause a memory leak.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • {cic,dic}.ref should be initialized to the number of compressed pages;
    let's initialize it directly rather than doing it via
    f2fs_set_compressed_page().
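
    Roughly (a sketch, assuming ref is a refcount_t):

            /* one shot, instead of a per-page increment via
             * f2fs_set_compressed_page() */
            refcount_set(&cic->ref, cc->nr_cpages);
            refcount_set(&dic->ref, dic->nr_cpages);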

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • If both the compression and fs-verity features are on, generic/572
    reports the NULL pointer dereference below.

    BUG: kernel NULL pointer dereference, address: 0000000000000018
    #PF: supervisor read access in kernel mode
    Workqueue: fsverity_read_queue f2fs_verity_work [f2fs]
    RIP: 0010:f2fs_verity_work+0x60/0x90 [f2fs]
    Call Trace:
    process_one_work+0x16c/0x3f0
    worker_thread+0x4c/0x440
    ? rescuer_thread+0x350/0x350
    kthread+0xf8/0x130
    ? kthread_unpark+0x70/0x70
    ret_from_fork+0x35/0x40

    There are two issues in f2fs_verity_work():
    - it needs to traverse and verify all pages in the bio.
    - if a page in the bio belongs to a non-compressed cluster, accessing
    the decompress IO context stored in its page private will cause a NULL
    pointer dereference.

    Fix them.
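
    A sketch of the guard for the second issue (the cluster verification
    helper is illustrative, not the actual function name):

            if (f2fs_is_compressed_page(page)) {
                    /* only compressed-cluster pages carry a decompress
                     * context in their page private */
                    struct decompress_io_ctx *dic =
                            (struct decompress_io_ctx *)page_private(page);

                    f2fs_verify_cluster(dic);       /* illustrative */
            } else if (!fsverity_verify_page(page)) {
                    SetPageError(page);
            }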

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • In f2fs_decompress_end_io(), we should clear the PG_error flag before
    unlocking the page; otherwise a reread will fail due to the flag, as
    described in commit fb7d70db305a ("f2fs: clear PageError on the read
    path").

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • generic/232 reports the deadlock below:

    fsstress D 0 96980 96969 0x00084000
    Call Trace:
    schedule+0x4a/0xb0
    io_schedule+0x12/0x40
    __lock_page+0x127/0x1d0
    pagecache_get_page+0x1d8/0x250
    prepare_compress_overwrite+0xe0/0x490 [f2fs]
    f2fs_prepare_compress_overwrite+0x5d/0x80 [f2fs]
    f2fs_write_begin+0x833/0xb90 [f2fs]
    f2fs_quota_write+0x145/0x1e0 [f2fs]
    write_blk+0x36/0x80 [quota_tree]
    do_insert_tree+0x2ac/0x4a0 [quota_tree]
    do_insert_tree+0x26e/0x4a0 [quota_tree]
    qtree_write_dquot+0x70/0x190 [quota_tree]
    v2_write_dquot+0x43/0x90 [quota_v2]
    dquot_acquire+0x77/0x100
    f2fs_dquot_acquire+0x2f/0x60 [f2fs]
    dqget+0x310/0x450
    dquot_transfer+0xb2/0x120
    f2fs_setattr+0x11a/0x4a0 [f2fs]
    notify_change+0x349/0x480
    chown_common+0x168/0x1c0
    do_fchownat+0xbc/0xf0
    __x64_sys_lchown+0x21/0x30
    do_syscall_64+0x5f/0x220
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    task PC stack pid father
    kworker/u256:0 D 0 103444 2 0x80084000
    Workqueue: writeback wb_workfn (flush-251:1)
    Call Trace:
    schedule+0x4a/0xb0
    schedule_timeout+0x15e/0x2f0
    io_schedule_timeout+0x19/0x40
    congestion_wait+0x7e/0x120
    f2fs_write_multi_pages+0x12a/0x840 [f2fs]
    f2fs_write_cache_pages+0x48f/0x790 [f2fs]
    f2fs_write_data_pages+0x2db/0x330 [f2fs]
    do_writepages+0x1a/0x60
    __writeback_single_inode+0x3d/0x340
    writeback_sb_inodes+0x225/0x4a0
    wb_writeback+0xf7/0x320
    wb_workfn+0xba/0x470
    process_one_work+0x16c/0x3f0
    worker_thread+0x4c/0x440
    kthread+0xf8/0x130
    ret_from_fork+0x35/0x40

    fsstress D 0 5277 5266 0x00084000
    Call Trace:
    schedule+0x4a/0xb0
    rwsem_down_write_slowpath+0x29d/0x540
    block_operations+0x105/0x360 [f2fs]
    f2fs_write_checkpoint+0x101/0x1010 [f2fs]
    f2fs_sync_fs+0xa8/0x130 [f2fs]
    f2fs_do_sync_file+0x1ad/0x890 [f2fs]
    do_fsync+0x38/0x60
    __x64_sys_fdatasync+0x13/0x20
    do_syscall_64+0x5f/0x220
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    The root cause is a potential deadlock between quota data update and
    writeback.

    Kworker                                  Thread B                Thread C
    - f2fs_write_cache_pages
     - lock whole cluster --- A
     - f2fs_write_multi_pages
      - f2fs_write_raw_pages
       - f2fs_write_single_data_page
        - f2fs_do_write_data_page
                                             - f2fs_setattr
                                              - f2fs_lock_op --- B
                                                                      - f2fs_write_checkpoint
                                                                       - block_operations
                                                                        - f2fs_lock_all --- B
                                             - dquot_transfer
                                              - f2fs_quota_write
                                               - f2fs_prepare_compress_overwrite
                                                - pagecache_get_page --- A
         - f2fs_trylock_op failed --- B
     - congestion_wait
     - goto rewrite

    To fix this issue, during quota file writeback just redirty all the
    pages left in the cluster, rather than holding the cluster's page locks
    while looping to retry taking cp_rwsem.
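
    A sketch of the quota-file escape in f2fs_write_raw_pages() (error
    handling simplified; the redirty label is illustrative):

            if (ret == -EAGAIN) {
                    /*
                     * For quota files, redirty the pages left in the
                     * cluster and give up, instead of retrying cp_rwsem
                     * while the cluster's page locks are held.
                     */
                    if (IS_NOQUOTA(cc->inode)) {
                            ret = 0;
                            goto out_redirty;
                    }
                    congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT);
                    goto retry_write;
            }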

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

23 Mar, 2020

1 commit

  • por_fsstress reports an inconsistent status in an orphan inode; the
    root cause is that in f2fs_write_raw_pages() we decrease i_compr_blocks
    incorrectly due to a wrong calculation in f2fs_compressed_blocks().

    So this patch exposes the two functions below, both based on
    __f2fs_cluster_blocks():
    - f2fs_compressed_blocks: get the count of compressed blocks in a
    compressed cluster
    - f2fs_cluster_blocks: get the count of valid blocks (including
    reserved blocks) in a compressed cluster

    Then use f2fs_compressed_blocks() to get the correct compressed block
    count in f2fs_write_raw_pages().

    sanity_check_inode: inode (ino=ad80) has inconsistent i_compr_blocks:2, i_blocks:1, run fsck to fix
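
    The split looks roughly like this (a sketch based on the description):

    /* count of compressed blocks in a compressed cluster */
    static int f2fs_compressed_blocks(struct compress_ctx *cc)
    {
            return __f2fs_cluster_blocks(cc, true);
    }

    /* count of valid blocks (including reserved blocks) in a cluster */
    static int f2fs_cluster_blocks(struct compress_ctx *cc)
    {
            return __f2fs_cluster_blocks(cc, false);
    }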

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

20 Mar, 2020

3 commits

  • If we are in the write IO path, we need to avoid using GFP_KERNEL.
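
    That is, allocations made from the writeback path should use GFP_NOFS;
    an illustrative call site (not the exact code):

            /* GFP_NOFS keeps reclaim from re-entering the filesystem
             * and deadlocking against the write path we are already in */
            cc->cpages = f2fs_kzalloc(sbi,
                            sizeof(struct page *) * cc->nr_cpages, GFP_NOFS);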

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • As Geert Uytterhoeven reported, for the parameter HZ/50 in
    congestion_wait(BLK_RW_ASYNC, HZ/50):

    on some platforms HZ can be less than 50, in which case an unexpected
    timeout of 0 jiffies will be passed to congestion_wait().

    This patch introduces a macro DEFAULT_IO_TIMEOUT that wraps the
    determinate value msecs_to_jiffies(20), replacing HZ/50 to avoid this
    issue.

    Quoted from Geert Uytterhoeven:

    "A timeout of HZ means 1 second.
    HZ/50 means 20 ms, but has the risk of being zero, if HZ < 50.

    If you want to use a timeout of 20 ms, you best use msecs_to_jiffies(20),
    as that takes care of the special cases, and never returns 0."
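
    The macro and a converted call site, matching the description (the
    macro's home in f2fs.h is assumed):

    #define DEFAULT_IO_TIMEOUT      (msecs_to_jiffies(20))

            congestion_wait(BLK_RW_ASYNC, DEFAULT_IO_TIMEOUT);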

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • - rename datablock_addr() to data_blkaddr().
    - wrap data_blkaddr() with f2fs_data_blkaddr() to clean up
    parameters.
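
    The wrapper, per the description:

    static inline block_t f2fs_data_blkaddr(struct dnode_of_data *dn)
    {
            return data_blkaddr(dn->inode, dn->node_page, dn->ofs_in_node);
    }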

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

11 Mar, 2020

2 commits

  • In a compressed cluster, if the physical block number is less than the
    logical page number, a race condition will cause the use-after-free
    issue described below:

    - f2fs_write_compressed_pages
     - fio.page = cic->rpages[0];
     - f2fs_outplace_write_data
                                    - f2fs_compress_write_end_io
                                     - kfree(cic->rpages);
                                     - kfree(cic);
     - fio.page = cic->rpages[1];   <- use after free

    f2fs_write_multi_pages+0xfd0/0x1a98
    f2fs_write_data_pages+0x74c/0xb5c
    do_writepages+0x64/0x108
    __writeback_single_inode+0xdc/0x4b8
    writeback_sb_inodes+0x4d0/0xa68
    __writeback_inodes_wb+0x88/0x178
    wb_writeback+0x1f0/0x424
    wb_workfn+0x2f4/0x574
    process_one_work+0x210/0x48c
    worker_thread+0x2e8/0x44c
    kthread+0x110/0x120
    ret_from_fork+0x10/0x18

    Fixes: 4c8ff7095bef ("f2fs: support data compression")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • This change solves the hung-task issue below:

    INFO: task kworker/u16:1:58 blocked for more than 122 seconds.
    Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kworker/u16:1 D 0 58 2 0x00000000
    Workqueue: writeback wb_workfn (flush-179:0)
    Backtrace:
    (__schedule) from [] (schedule+0x78/0xf4)
    (schedule) from [] (rwsem_down_write_slowpath+0x24c/0x4c0)
    (rwsem_down_write_slowpath) from [] (down_write+0x6c/0x70)
    (down_write) from [] (f2fs_write_single_data_page+0x608/0x7ac)
    (f2fs_write_single_data_page) from [] (f2fs_write_cache_pages+0x2b4/0x7c4)
    (f2fs_write_cache_pages) from [] (f2fs_write_data_pages+0x344/0x35c)
    (f2fs_write_data_pages) from [] (do_writepages+0x3c/0xd4)
    (do_writepages) from [] (__writeback_single_inode+0x44/0x454)
    (__writeback_single_inode) from [] (writeback_sb_inodes+0x204/0x4b0)
    (writeback_sb_inodes) from [] (__writeback_inodes_wb+0x50/0xe4)
    (__writeback_inodes_wb) from [] (wb_writeback+0x294/0x338)
    (wb_writeback) from [] (wb_workfn+0x35c/0x54c)
    (wb_workfn) from [] (process_one_work+0x214/0x544)
    (process_one_work) from [] (worker_thread+0x4c/0x574)
    (worker_thread) from [] (kthread+0x144/0x170)
    (kthread) from [] (ret_from_fork+0x14/0x2c)

    Reported-and-tested-by: Ondřej Jirman
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     

28 Feb, 2020

4 commits

  • Use f2fs_trylock_op() in f2fs_write_compressed_pages() to avoid a
    potential deadlock, as we did in f2fs_write_single_data_page().
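
    A sketch of the change:

            /* back off instead of blocking on cp_rwsem; the caller keeps
             * the pages dirty and retries the cluster later */
            if (!f2fs_trylock_op(sbi))
                    return -EAGAIN;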

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • In struct compress_data, the chksum field was never used; remove it.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • Unable to handle kernel NULL pointer dereference at virtual address 00000000
    PC is at f2fs_free_dic+0x60/0x2c8
    LR is at f2fs_decompress_pages+0x3c4/0x3e8
    f2fs_free_dic+0x60/0x2c8
    f2fs_decompress_pages+0x3c4/0x3e8
    __read_end_io+0x78/0x19c
    f2fs_post_read_work+0x6c/0x94
    process_one_work+0x210/0x48c
    worker_thread+0x2e8/0x44c
    kthread+0x110/0x120
    ret_from_fork+0x10/0x18

    In f2fs_free_dic(), we cannot use f2fs_put_page(,1) to release
    dic->tpages[i], as the page's mapping is NULL.
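
    One possible shape of the fix (a sketch; f2fs_put_page() dereferences
    the page's mapping for a sanity check, which is what crashes here):

            for (i = 0; i < dic->cluster_size; i++) {
                    if (!dic->tpages[i])
                            continue;
                    /* tpages have a NULL mapping: unlock and drop the
                     * reference directly */
                    unlock_page(dic->tpages[i]);
                    put_page(dic->tpages[i]);
            }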

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu
     
  • When the compressed data of a cluster doesn't end on a page boundary,
    the remainder of the last page must be zeroed in order to avoid leaking
    uninitialized memory to disk.
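
    The fix amounts to one memset after compression succeeds (names follow
    the compression code; COMPRESS_HEADER_SIZE covers the length/chksum
    header):

            /* zero out any unused part of the last page */
            memset(&cc->cbuf->cdata[cc->clen], 0,
                   (nr_cpages * PAGE_SIZE) -
                   (cc->clen + COMPRESS_HEADER_SIZE));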

    Fixes: 4c8ff7095bef ("f2fs: support data compression")
    Signed-off-by: Eric Biggers
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Eric Biggers
     

18 Jan, 2020

1 commit

  • This patch tries to support compression in f2fs.

    - A new term, cluster, is defined as the basic unit of compression; a
    file can be logically divided into multiple clusters. One cluster
    includes 4 << n (n >= 0) logical pages, the compression size is also
    the cluster size, and each cluster can be compressed or not.

    - In the cluster metadata layout, one special flag is used to indicate
    whether a cluster is a compressed one or a normal one; for a compressed
    cluster, the following metadata maps the cluster to [1, 4 << n - 1]
    physical blocks, in which f2fs stores data including the compress
    header and the compressed data.

    - In order to eliminate write amplification during overwrite, F2FS only
    supports compression on write-once files: data can be compressed only
    when all logical blocks in the file are valid and the cluster compress
    ratio is lower than a specified threshold.

    - To enable compression on a regular inode, there are three ways:
    * chattr +c file
    * chattr +c dir; touch dir/file
    * mount w/ -o compress_extension=ext; touch file.ext

    Compress metadata layout:

                             [Dnode Structure]
             +-----------------------------------------------+
             | cluster 1 | cluster 2 | ......... | cluster N |
             +-----------------------------------------------+
               .           .                       .        .
          .                 .                  .                  .
      .      Compressed Cluster      .             .      Normal Cluster      .
    +----------+---------+---------+---------+  +---------+---------+---------+---------+
    |compr flag| block 1 | block 2 | block 3 |  | block 1 | block 2 | block 3 | block 4 |
    +----------+---------+---------+---------+  +---------+---------+---------+---------+
       .                                     .
         .                                        .
           .                                           .
    +-------------+-------------+----------+----------------------------+
    | data length | data chksum | reserved |      compressed data       |
    +-------------+-------------+----------+----------------------------+
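
    The per-cluster header in the last row can be sketched as the following
    struct (field widths are assumptions consistent with the layout):

    struct compress_data {
            __le32 clen;            /* compressed data length */
            __le32 chksum;          /* compressed data chksum */
            __le32 reserved[4];     /* reserved */
            u8 cdata[];             /* compressed data */
    };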

    Changelog:

    20190326:
    - fix error handling of read_end_io().
    - remove unneeded comments in f2fs_encrypt_one_page().

    20190327:
    - fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages().
    - don't jump into loop directly to avoid uninitialized variables.
    - add TODO tag in error path of f2fs_write_cache_pages().

    20190328:
    - fix wrong merge condition in f2fs_read_multi_pages().
    - check compressed file in f2fs_post_read_required().

    20190401
    - allow overwrite on non-compressed cluster.
    - check cluster meta before writing compressed data.

    20190402
    - don't preallocate blocks for compressed file.

    - add lz4 compress algorithm
    - process multiple post read works in one workqueue;
    previously f2fs processed post read works in multiple workqueues, which
    showed low performance due to the scheduling overhead of the workqueues
    executing in order.

    20190921
    - compress: support buffered overwrite
    C: compress cluster flag
    V: valid block address
    N: NEW_ADDR

    One cluster contains 4 blocks:

    before overwrite        after overwrite

    - VVVV          ->      CVNN
    - CVNN          ->      VVVV

    - CVNN          ->      CVNN
    - CVNN          ->      CVVV

    - CVVV          ->      CVNN
    - CVVV          ->      CVVV

    20191029
    - add kconfig F2FS_FS_COMPRESSION to isolate compression-related code,
    and kconfig F2FS_FS_{LZO,LZ4} to cover the backend algorithms.
    note: the lzo backend will be removed if Jaegeuk agrees.
    - update the code according to Eric's comments.
    - update codes according to Eric's comments.

    20191101
    - apply fixes from Jaegeuk

    20191113
    - apply fixes from Jaegeuk
    - split workqueue for fsverity

    20191216
    - apply fixes from Jaegeuk

    20200117
    - fix to avoid NULL pointer dereference

    [Jaegeuk Kim]
    - add tracepoint for f2fs_{,de}compress_pages()
    - fix many bugs and add some compression stats
    - fix overwrite/mmap bugs
    - address 32bit build error, reported by Geert.
    - bug fixes when handling errors and i_compressed_blocks

    Reported-by:
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim

    Chao Yu