25 Aug, 2022

2 commits

  • [ Upstream commit 09beadf289d6e300553e60d6e76f13c0427ecab3 ]

    As Wenqing Liu reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=216285

    RIP: 0010:memcpy_erms+0x6/0x10
    f2fs_update_meta_page+0x84/0x570 [f2fs]
    change_curseg.constprop.0+0x159/0xbd0 [f2fs]
    f2fs_do_replace_block+0x5c7/0x18a0 [f2fs]
    f2fs_replace_block+0xeb/0x180 [f2fs]
    recover_data+0x1abd/0x6f50 [f2fs]
    f2fs_recover_fsync_data+0x12ce/0x3250 [f2fs]
    f2fs_fill_super+0x4459/0x6190 [f2fs]
    mount_bdev+0x2cf/0x3b0
    legacy_get_tree+0xed/0x1d0
    vfs_get_tree+0x81/0x2b0
    path_mount+0x47e/0x19d0
    do_mount+0xce/0xf0
    __x64_sys_mount+0x12c/0x1a0
    do_syscall_64+0x38/0x90
    entry_SYSCALL_64_after_hwframe+0x63/0xcd

    The root cause is that the segment type is invalid, so in
    f2fs_do_replace_block() f2fs accesses f2fs_sm_info::curseg_array with an
    out-of-range segment type, which results in accessing an invalid
    curseg->sum_blk during memcpy in f2fs_update_meta_page(). Fix this by
    adding a sanity check on the segment type in build_sit_entries().
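
    A minimal user-space sketch of the kind of range check described above.
    The slot count, the struct layout and the helper name are assumptions
    made for illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of current-segment slots. */
#define NR_CURSEG_SLOTS 6

/* Hypothetical, heavily reduced model of a current-segment entry. */
struct curseg {
    int next_blkoff;
};

/*
 * Return the slot for a segment type read from disk, or NULL when the
 * type is out of range and must not be used as an array index.
 */
static struct curseg *curseg_for_type(struct curseg *slots, unsigned int type)
{
    assert(slots != NULL);
    if (type >= NR_CURSEG_SLOTS)
        return NULL;    /* corrupted type: the caller should reject the image */
    return &slots[type];
}
```

    Rejecting the value where it is first read from disk means later code
    can index the array without repeating the check.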

    Reported-by: Wenqing Liu
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 141170b759e03958f296033bb7001be62d1d363b ]

    As Dipanjan Das reported, syzkaller
    found an f2fs bug as below:

    RIP: 0010:f2fs_new_node_page+0x19ac/0x1fc0 fs/f2fs/node.c:1295
    Call Trace:
    write_all_xattrs fs/f2fs/xattr.c:487 [inline]
    __f2fs_setxattr+0xe76/0x2e10 fs/f2fs/xattr.c:743
    f2fs_setxattr+0x233/0xab0 fs/f2fs/xattr.c:790
    f2fs_xattr_generic_set+0x133/0x170 fs/f2fs/xattr.c:86
    __vfs_setxattr+0x115/0x180 fs/xattr.c:182
    __vfs_setxattr_noperm+0x125/0x5f0 fs/xattr.c:216
    __vfs_setxattr_locked+0x1cf/0x260 fs/xattr.c:277
    vfs_setxattr+0x13f/0x330 fs/xattr.c:303
    setxattr+0x146/0x160 fs/xattr.c:611
    path_setxattr+0x1a7/0x1d0 fs/xattr.c:630
    __do_sys_lsetxattr fs/xattr.c:653 [inline]
    __se_sys_lsetxattr fs/xattr.c:649 [inline]
    __x64_sys_lsetxattr+0xbd/0x150 fs/xattr.c:649
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x46/0xb0

    The NAT entry and the NAT bitmap can be inconsistent, e.g. a nid is free
    in the NAT bitmap while blkaddr in its NAT entry is not NULL_ADDR. This
    may trigger BUG_ON() in f2fs_new_node_page(), so fix it.

    Reported-by: Dipanjan Das
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     

17 Aug, 2022

3 commits

  • [ Upstream commit 90be48bd9d29ece3965e5e8b21499b6db166e57b ]

    If a file has FI_COMPRESS_RELEASED, no writes to it should be
    allowed. However, as of now, in case of compress_mode=user, writes
    triggered by IOCTLs like F2FS_IOC_DE/COMPRESS_FILE are unexpectedly allowed,
    which could corrupt that file.
    To fix it, do not allow F2FS_IOC_DE/COMPRESS_FILE if a file already
    has the FI_COMPRESS_RELEASED flag.

    This is the reproduction process:
    1. $ touch ./file
    2. $ chattr +c ./file
    3. $ dd if=/dev/random of=./file bs=4096 count=30 conv=notrunc
    4. $ dd if=/dev/zero of=./file bs=4096 count=34 seek=30 conv=notrunc
    5. $ sync
    6. $ do_compress ./file ; call F2FS_IOC_COMPRESS_FILE
    7. $ get_compr_blocks ./file ; call F2FS_IOC_GET_COMPRESS_BLOCKS
    8. $ release ./file ; call F2FS_IOC_RELEASE_COMPRESS_BLOCKS
    9. $ do_compress ./file ; call F2FS_IOC_COMPRESS_FILE again
    10. $ get_compr_blocks ./file ; call F2FS_IOC_GET_COMPRESS_BLOCKS again

    This reproduction process was tested with a 128 KB cluster size.
    Afterwards, compr_blocks has a negative value.

    Fixes: 5fdb322ff2c2b ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE")

    Signed-off-by: Junbeom Yeom
    Signed-off-by: Sungjong Seo
    Signed-off-by: Youngjin Gil
    Signed-off-by: Jaewook Kim
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaewook Kim
     
  • [ Upstream commit 66d34fcbbe63ebd8584b792e0d741f6648100894 ]

    Since commit e3c548323d32 ("f2fs: let's allow compression for mmap files"),
    it has been allowed to compress mmap files. However, in compress_mode=user,
    it is not allowed yet. To keep the same concept in both compress_modes,
    f2fs_ioc_(de)compress_file() should also allow it.

    Let's remove checking mmap files in f2fs_ioc_(de)compress_file() so that
    the compression for mmap files is also allowed in compress_mode=user.

    Signed-off-by: Sungjong Seo
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Sungjong Seo
     
  • [ Upstream commit 8ee236dcaa690d09ca612622e8bc8d09c302021d ]

    If the inode has the compress flag, using 'chattr -c +m' to remove
    its compress flag and set the nocompress flag fails.
    However, the same command succeeds when executed again,
    as shown below:

    $ touch foo.txt
    $ chattr +c foo.txt
    $ chattr -c +m foo.txt
    chattr: Invalid argument while setting flags on foo.txt
    $ chattr -c +m foo.txt
    $ f2fs_io getflags foo.txt
    get a flag on foo.txt ret=0, flags=nocompression,inline_data

    Fix this by removing some checks in f2fs_setflags_common()
    that do not affect the original logic. I went through all the
    possible scenarios, and the results are as follows. The entry
    marked with asterisks is the only one that has changed.

    +---------------+-----------+-----------+----------+
    |               |            file flags            |
    + command       +-----------+-----------+----------+
    |               | no flag   | compr     | nocompr  |
    +---------------+-----------+-----------+----------+
    | chattr +c     | compr     | compr     | -EINVAL  |
    | chattr -c     | no flag   | no flag   | nocompr  |
    | chattr +m     | nocompr   | -EINVAL   | nocompr  |
    | chattr -m     | no flag   | compr     | no flag  |
    | chattr +c +m  | -EINVAL   | -EINVAL   | -EINVAL  |
    | chattr +c -m  | compr     | compr     | compr    |
    | chattr -c +m  | nocompr   | *nocompr* | nocompr  |
    | chattr -c -m  | no flag   | no flag   | no flag  |
    +---------------+-----------+-----------+----------+
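
    The table can be reproduced by a small user-space model: compute the
    new flag word as (old | set) & ~clear and reject it only when both
    attributes end up set. This is a sketch that matches the table, not
    the code of f2fs_setflags_common(); the flag values and the helper
    name are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#define FLAG_COMPR   0x1u   /* models the 'c' attribute */
#define FLAG_NOCOMPR 0x2u   /* models the 'm' attribute */

/*
 * Apply "chattr +set -clear" to the old flags. Returns the new flag
 * word, or -EINVAL when the result would carry both attributes.
 */
static int apply_chattr(unsigned int old, unsigned int set, unsigned int clear)
{
    unsigned int new_flags;

    assert((set & clear) == 0);   /* a flag cannot be set and cleared at once */
    new_flags = (old | set) & ~clear;
    if ((new_flags & FLAG_COMPR) && (new_flags & FLAG_NOCOMPR))
        return -EINVAL;
    return (int)new_flags;
}
```

    In this model the formerly failing case, 'chattr -c +m' on a file
    with the compr flag, yields nocompr because only the final flag word
    is validated.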

    Link: https://lore.kernel.org/linux-f2fs-devel/20220621064833.1079383-1-chaoliu719@gmail.com/
    Fixes: 4c8ff7095bef ("f2fs: support data compression")
    Reviewed-by: Chao Yu
    Signed-off-by: Chao Liu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Liu
     

29 Jun, 2022

1 commit

  • commit 4cde00d50707c2ef6647b9b96b2cb40b6eb24397 upstream.

    This fixes the below corruption.

    [345393.335389] F2FS-fs (vdb): sanity_check_inode: inode (ino=6d0, mode=33206) should not have inline_data, run fsck to fix

    Cc:
    Fixes: 677a82b44ebf ("f2fs: fix to do sanity check for inline inode")
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     

15 Jun, 2022

2 commits

  • [ Upstream commit 2d1fe8a86bf5e0663866fd0da83c2af1e1b0e362 ]

    This is needed to guarantee that migrated data is persisted during
    checkpoint; otherwise, out-of-order persistence between data and node
    may cause data corruption after SPOR.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit dc2f78e2d4cc844a1458653d57ce1b54d4a29f21 ]

    Syzbot triggers two WARNs in f2fs_is_valid_blkaddr and
    __is_bitmap_valid. For example, in f2fs_is_valid_blkaddr,
    if type is DATA_GENERIC_ENHANCE or DATA_GENERIC_ENHANCE_READ,
    it invokes WARN_ON if blkaddr is not in the right range.
    The call trace is as follows:

    f2fs_get_node_info+0x45f/0x1070
    read_node_page+0x577/0x1190
    __get_node_page.part.0+0x9e/0x10e0
    __get_node_page
    f2fs_get_node_page+0x109/0x180
    do_read_inode
    f2fs_iget+0x2a5/0x58b0
    f2fs_fill_super+0x3b39/0x7ca0

    Fix these two WARNs by replacing WARN_ON with dump_stack.

    Reported-by: syzbot+763ae12a2ede1d99d4dc@syzkaller.appspotmail.com
    Signed-off-by: Dongliang Mu
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Dongliang Mu
     

09 Jun, 2022

12 commits

  • commit 677a82b44ebf263d4f9a0cfbd576a6ade797a07b upstream.

    Yanming reported a reproducible kernel bug in the kernel Bugzilla [1].
    The kernel message is shown below:

    kernel BUG at fs/inode.c:611!
    Call Trace:
    evict+0x282/0x4e0
    __dentry_kill+0x2b2/0x4d0
    dput+0x2dd/0x720
    do_renameat2+0x596/0x970
    __x64_sys_rename+0x78/0x90
    do_syscall_64+0x3b/0x90

    [1] https://bugzilla.kernel.org/show_bug.cgi?id=215895

    The bug occurs because a fuzzed inode has both the inline_data and
    encrypted flags. During f2fs_evict_inode(), as the inode was deleted by
    rename(), the conflicting flags cause an inline data conversion. The page
    cache is then polluted and the panic is triggered in clear_inode().

    Try fixing the bug by doing more sanity checks for inline data inode in
    sanity_check_inode().

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 958ed92922028ec67f504dcdc72bfdfd0f43936a upstream.

    This patch fixes a permission consistency issue, as all other
    mainline filesystems have done.

    Since the initial introduction of (posix) fallocate back at the turn of
    the century, it has been possible to use this syscall to change the
    user-visible contents of files. This can happen by extending the file
    size during a preallocation, or through any of the newer modes (punch,
    zero, collapse, insert range). Because the call can be used to change
    file contents, we should treat it like we do any other modification to a
    file -- update the mtime, and drop set[ug]id privileges/capabilities.

    The VFS function file_modified() does all this for us if we pass it a
    locked inode, so let's make fallocate drop permissions correctly.

    Cc: stable@kernel.org
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit b5639bb4313b9d455fc9fc4768d23a5e4ca8cb9d upstream.

    Trying to rename a directory that has all of the following properties
    fails with EINVAL and triggers the
    'WARN_ON_ONCE(!fscrypt_has_encryption_key(dir))' in f2fs_match_ci_name():

    - The directory is casefolded
    - The directory is encrypted
    - The directory's encryption key is not yet set up
    - The parent directory is *not* encrypted

    The problem is incorrect handling of the lookup of ".." to get the
    parent reference to update. fscrypt_setup_filename() treats ".." (and
    ".") specially, as it's never encrypted. It's passed through as-is, and
    setting up the directory's key is not attempted. As the name isn't a
    no-key name, f2fs treats it as a "normal" name and attempts a casefolded
    comparison. That breaks the assumption of the WARN_ON_ONCE() in
    f2fs_match_ci_name() which assumes that for encrypted directories,
    casefolded comparisons only happen when the directory's key is set up.

    We could just remove this WARN_ON_ONCE(). However, since casefolding is
    always a no-op on "." and ".." anyway, let's instead just not casefold
    these names. This results in the standard bytewise comparison.
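
    A user-space sketch of the decision described above: skip casefolding
    for "." and ".." so they fall back to a plain bytewise comparison.
    Both helper names are hypothetical and the encryption-key handling is
    left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the name is exactly "." or "..". */
static bool is_dot_or_dotdot(const char *name, size_t len)
{
    assert(name != NULL);
    if (len == 1)
        return name[0] == '.';
    if (len == 2)
        return name[0] == '.' && name[1] == '.';
    return false;
}

/*
 * Decide whether a lookup name should be casefolded. Casefolding is a
 * no-op on "." and "..", so those names are never casefolded.
 */
static bool needs_casefold(bool dir_is_casefolded, const char *name, size_t len)
{
    return dir_is_casefolded && !is_dot_or_dotdot(name, len);
}
```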

    Fixes: 7ad08a58bf67 ("f2fs: Handle casefolding with Encryption")
    Cc: # v5.11+
    Signed-off-by: Eric Biggers
    Reviewed-by: Gabriel Krisman Bertazi
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 6b8beca0edd32075a769bfe4178ca00c0dcd22a9 upstream.

    As Yanming reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215916

    The kernel message is shown below:

    kernel BUG at fs/f2fs/segment.c:2560!
    Call Trace:
    allocate_segment_by_default+0x228/0x440
    f2fs_allocate_data_block+0x13d1/0x31f0
    do_write_page+0x18d/0x710
    f2fs_outplace_write_data+0x151/0x250
    f2fs_do_write_data_page+0xef9/0x1980
    move_data_page+0x6af/0xbc0
    do_garbage_collect+0x312f/0x46f0
    f2fs_gc+0x6b0/0x3bc0
    f2fs_balance_fs+0x921/0x2260
    f2fs_write_single_data_page+0x16be/0x2370
    f2fs_write_cache_pages+0x428/0xd00
    f2fs_write_data_pages+0x96e/0xd50
    do_writepages+0x168/0x550
    __writeback_single_inode+0x9f/0x870
    writeback_sb_inodes+0x47d/0xb20
    __writeback_inodes_wb+0xb2/0x200
    wb_writeback+0x4bd/0x660
    wb_workfn+0x5f3/0xab0
    process_one_work+0x79f/0x13e0
    worker_thread+0x89/0xf60
    kthread+0x26a/0x300
    ret_from_fork+0x22/0x30
    RIP: 0010:new_curseg+0xe8d/0x15f0

    The root cause is: ckpt.valid_block_count is inconsistent with the SIT
    table. The stat info indicates that the filesystem has free blocks, but
    the SIT table indicates that it has no free segment.

    As a result, during garbage collection, a panic is triggered when the
    LFS allocator fails to find a free segment.

    This patch tries to fix this issue by checking the consistency between
    ckpt.valid_block_count and the blocks accounted from the SIT.
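
    One simple form such a cross-check could take, as a user-space sketch.
    It treats the image as inconsistent when the blocks accounted from the
    per-segment entries exceed the checkpoint count; that comparison, the
    data layout and the helper name are assumptions, since the exact check
    used by the patch is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the per-segment valid block counts and compare the total with
 * the valid block count recorded in the checkpoint.
 */
static bool sit_blocks_consistent(const uint16_t *seg_valid_blocks,
                                  size_t nr_segs,
                                  uint64_t ckpt_valid_block_count)
{
    uint64_t total = 0;

    assert(seg_valid_blocks != NULL || nr_segs == 0);
    for (size_t i = 0; i < nr_segs; i++)
        total += seg_valid_blocks[i];

    /* More blocks in use than the checkpoint admits: refuse the image. */
    return total <= ckpt_valid_block_count;
}
```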

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 6213f5d4d23c50d393a31dc8e351e63a1fd10dbe upstream.

    Let's avoid false-alarmed lockdep warning.

    [ 58.914674] [T1501146] -> #2 (&sb->s_type->i_mutex_key#20){+.+.}-{3:3}:
    [ 58.915975] [T1501146] system_server: down_write+0x7c/0xe0
    [ 58.916738] [T1501146] system_server: f2fs_quota_sync+0x60/0x1a8
    [ 58.917563] [T1501146] system_server: block_operations+0x16c/0x43c
    [ 58.918410] [T1501146] system_server: f2fs_write_checkpoint+0x114/0x318
    [ 58.919312] [T1501146] system_server: f2fs_issue_checkpoint+0x178/0x21c
    [ 58.920214] [T1501146] system_server: f2fs_sync_fs+0x48/0x6c
    [ 58.920999] [T1501146] system_server: f2fs_do_sync_file+0x334/0x738
    [ 58.921862] [T1501146] system_server: f2fs_sync_file+0x30/0x48
    [ 58.922667] [T1501146] system_server: __arm64_sys_fsync+0x84/0xf8
    [ 58.923506] [T1501146] system_server: el0_svc_common.llvm.12821150825140585682+0xd8/0x20c
    [ 58.924604] [T1501146] system_server: do_el0_svc+0x28/0xa0
    [ 58.925366] [T1501146] system_server: el0_svc+0x24/0x38
    [ 58.926094] [T1501146] system_server: el0_sync_handler+0x88/0xec
    [ 58.926920] [T1501146] system_server: el0_sync+0x1b4/0x1c0

    [ 58.927681] [T1501146] -> #1 (&sbi->cp_global_sem){+.+.}-{3:3}:
    [ 58.928889] [T1501146] system_server: down_write+0x7c/0xe0
    [ 58.929650] [T1501146] system_server: f2fs_write_checkpoint+0xbc/0x318
    [ 58.930541] [T1501146] system_server: f2fs_issue_checkpoint+0x178/0x21c
    [ 58.931443] [T1501146] system_server: f2fs_sync_fs+0x48/0x6c
    [ 58.932226] [T1501146] system_server: sync_filesystem+0xac/0x130
    [ 58.933053] [T1501146] system_server: generic_shutdown_super+0x38/0x150
    [ 58.933958] [T1501146] system_server: kill_block_super+0x24/0x58
    [ 58.934791] [T1501146] system_server: kill_f2fs_super+0xcc/0x124
    [ 58.935618] [T1501146] system_server: deactivate_locked_super+0x90/0x120
    [ 58.936529] [T1501146] system_server: deactivate_super+0x74/0xac
    [ 58.937356] [T1501146] system_server: cleanup_mnt+0x128/0x168
    [ 58.938150] [T1501146] system_server: __cleanup_mnt+0x18/0x28
    [ 58.938944] [T1501146] system_server: task_work_run+0xb8/0x14c
    [ 58.939749] [T1501146] system_server: do_notify_resume+0x114/0x1e8
    [ 58.940595] [T1501146] system_server: work_pending+0xc/0x5f0

    [ 58.941375] [T1501146] -> #0 (&sbi->gc_lock){+.+.}-{3:3}:
    [ 58.942519] [T1501146] system_server: __lock_acquire+0x1270/0x2868
    [ 58.943366] [T1501146] system_server: lock_acquire+0x114/0x294
    [ 58.944169] [T1501146] system_server: down_write+0x7c/0xe0
    [ 58.944930] [T1501146] system_server: f2fs_issue_checkpoint+0x13c/0x21c
    [ 58.945831] [T1501146] system_server: f2fs_sync_fs+0x48/0x6c
    [ 58.946614] [T1501146] system_server: f2fs_do_sync_file+0x334/0x738
    [ 58.947472] [T1501146] system_server: f2fs_ioc_commit_atomic_write+0xc8/0x14c
    [ 58.948439] [T1501146] system_server: __f2fs_ioctl+0x674/0x154c
    [ 58.949253] [T1501146] system_server: f2fs_ioctl+0x54/0x88
    [ 58.950018] [T1501146] system_server: __arm64_sys_ioctl+0xa8/0x110
    [ 58.950865] [T1501146] system_server: el0_svc_common.llvm.12821150825140585682+0xd8/0x20c
    [ 58.951965] [T1501146] system_server: do_el0_svc+0x28/0xa0
    [ 58.952727] [T1501146] system_server: el0_svc+0x24/0x38
    [ 58.953454] [T1501146] system_server: el0_sync_handler+0x88/0xec
    [ 58.954279] [T1501146] system_server: el0_sync+0x1b4/0x1c0

    Cc: stable@vger.kernel.org
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     
  • commit cfd66bb715fd11fde3338d0660cffa1396adc27d upstream.

    As Yanming reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215914

    The root cause is: in a very small image, it is very easy to exceed
    the foreground GC threshold. If we calculate free space and dirty data
    at section granularity, then in a corner case
    has_not_enough_free_secs() always returns true, resulting in an
    endless loop in f2fs_gc().

    So this patch refactors has_not_enough_free_secs() as below to fix
    this issue:
    1. calculate the needed space at block granularity, and split all
    blocks into two parts, a section part and a block part, comparing the
    section part to free sections and the block part to the free space
    in the opened log;
    2. account F2FS_DIRTY_NODES, F2FS_DIRTY_IMETA and F2FS_DIRTY_DENTS
    as node block consumers;
    3. account F2FS_DIRTY_DENTS as a data block consumer.
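
    Step 1 above can be illustrated with a small user-space sketch that
    splits a block count into a section part and a block part and compares
    each part separately. The struct, the helper names and the exact
    comparison are assumptions; the real has_not_enough_free_secs() also
    accounts for node and dentry blocks as listed in steps 2 and 3.

```c
#include <assert.h>
#include <stdbool.h>

/* Needed space split into whole sections plus leftover blocks. */
struct space_need {
    unsigned int sections;
    unsigned int blocks;
};

static struct space_need split_need(unsigned int needed_blocks,
                                    unsigned int blocks_per_sec)
{
    struct space_need n;

    assert(blocks_per_sec > 0);
    n.sections = needed_blocks / blocks_per_sec;
    n.blocks = needed_blocks % blocks_per_sec;
    return n;
}

/*
 * Compare the section part against free sections and the block part
 * against the free space left in the currently open log.
 */
static bool not_enough_space(struct space_need need,
                             unsigned int free_sections,
                             unsigned int free_blocks_in_open_log)
{
    if (need.sections > free_sections)
        return true;
    return need.blocks > free_blocks_in_open_log;
}
```

    Working at block granularity keeps a handful of dirty blocks in a tiny
    image from being rounded up to a whole section.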

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit f2db71053dc0409fae785096ad19cce4c8a95af7 upstream.

    As Yanming reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215904

    The kernel message is shown below:

    kernel BUG at fs/f2fs/inode.c:825!
    Call Trace:
    evict+0x282/0x4e0
    __dentry_kill+0x2b2/0x4d0
    shrink_dentry_list+0x17c/0x4f0
    shrink_dcache_parent+0x143/0x1e0
    do_one_tree+0x9/0x30
    shrink_dcache_for_umount+0x51/0x120
    generic_shutdown_super+0x5c/0x3a0
    kill_block_super+0x90/0xd0
    kill_f2fs_super+0x225/0x310
    deactivate_locked_super+0x78/0xc0
    cleanup_mnt+0x2b7/0x480
    task_work_run+0xc8/0x150
    exit_to_user_mode_prepare+0x14a/0x150
    syscall_exit_to_user_mode+0x1d/0x40
    do_syscall_64+0x48/0x90

    The root cause is: the inode node and a dnode node share the same nid,
    so during f2fs_evict_inode(), dnode node truncation invalidates
    its NAT entry. Truncating the inode node then fails due to the
    invalid NAT entry, which leaves the inode marked as dirty. Fix
    this issue by clearing the dirty state of the inode and setting the
    SBI_NEED_FSCK flag in the filesystem.

    output from dump.f2fs:
    [print_node_info: 354] Node ID [0xf:15] is inode
    i_nid[0] [0x f : 15]

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 25f8236213a91efdf708b9d77e9e51b6fc3e141c upstream.

    As Yanming reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215894

    I have encountered a bug in F2FS file system in kernel v5.17.

    I have uploaded the system call sequence as case.c, and a fuzzed image can
    be found in google net disk

    The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
    reproduce the bug by running the following commands:

    kernel BUG at fs/f2fs/segment.c:2291!
    Call Trace:
    f2fs_invalidate_blocks+0x193/0x2d0
    f2fs_fallocate+0x2593/0x4a70
    vfs_fallocate+0x2a5/0xac0
    ksys_fallocate+0x35/0x70
    __x64_sys_fallocate+0x8e/0xf0
    do_syscall_64+0x3b/0x90
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    The root cause is that, after the image was fuzzed, the block mapping
    info in the inode is inconsistent with the SIT table, so f2fs_fallocate()
    panics when updating the SIT with an invalid blkaddr.

    Let's fix the issue by adding a sanity check on the block address before
    updating the SIT table with it.

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 4d17e6fe9293d57081ffdc11e1cf313e25e8fd9e upstream.

    As Yanming reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215897

    I have encountered a bug in F2FS file system in kernel v5.17.

    The kernel should enable CONFIG_KASAN=y and CONFIG_KASAN_INLINE=y. You can
    reproduce the bug by running the following commands:

    The kernel message is shown below:

    kernel BUG at fs/f2fs/f2fs.h:2511!
    Call Trace:
    f2fs_remove_inode_page+0x2a2/0x830
    f2fs_evict_inode+0x9b7/0x1510
    evict+0x282/0x4e0
    do_unlinkat+0x33a/0x540
    __x64_sys_unlinkat+0x8e/0xd0
    do_syscall_64+0x3b/0x90
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    The root cause is: .total_valid_block_count or .total_valid_node_count
    could be fuzzed to zero, and once dec_valid_node_count() is called, it
    triggers BUG_ON(). This patch changes it to print warning info and set
    SBI_NEED_FSCK into CP instead of panicking.
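
    The change in behaviour can be sketched in user space as a guarded
    decrement: warn and flag the filesystem for fsck instead of asserting.
    The struct, the field names and the message text are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, reduced model of the superblock counters. */
struct counters {
    unsigned int total_valid_node_count;
    bool need_fsck;           /* models SBI_NEED_FSCK */
};

/* Decrement the node count, tolerating a corrupted (zero) counter. */
static void dec_node_count(struct counters *c)
{
    assert(c != NULL);
    if (c->total_valid_node_count == 0) {
        fprintf(stderr, "inconsistent node count, run fsck to fix\n");
        c->need_fsck = true;  /* record the problem instead of panicking */
        return;
    }
    c->total_valid_node_count--;
}
```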

    Cc: stable@vger.kernel.org
    Reported-by: Ming Yan
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • [ Upstream commit 2aaf51dd39afb6d01d13f1e6fe20b684733b37d5 ]

    The list iterator variable will be a bogus pointer if no break was hit.
    Dereferencing it (cur->page in this case) could load an out-of-bounds/undefined
    value, making it unsafe to use that in the comparison to determine if the
    specific element was found.

    Since 'cur->page' *can* be out-of-bounds, it cannot be guaranteed that
    it does not, by chance (or by intention of an attacker), match the value
    of 'page' even though the correct element was not found.

    This is fixed by using a separate list iterator variable for the loop
    and only setting the original variable if a suitable element was found.
    Then determining if the element was found is simply checking if the
    variable is set.
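
    The pattern can be shown with a self-contained user-space sketch of a
    circular list with a sentinel head. The types and the entry_of() macro
    are simplified stand-ins written for this example, not the kernel's
    list API.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular singly linked list with a sentinel head. */
struct list_head {
    struct list_head *next;
};

struct entry {
    int page;
    struct list_head link;
};

/* Recover the containing entry from its embedded list node. */
#define entry_of(ptr) \
    ((struct entry *)((char *)(ptr) - offsetof(struct entry, link)))

/*
 * Use a separate iterator inside the loop and set 'found' only on a
 * match, so the result is NULL rather than a pointer computed from the
 * sentinel head when nothing matches.
 */
static struct entry *find_entry(struct list_head *head, int page)
{
    struct entry *found = NULL;

    assert(head != NULL);
    for (struct list_head *pos = head->next; pos != head; pos = pos->next) {
        struct entry *iter = entry_of(pos);

        if (iter->page == page) {
            found = iter;
            break;
        }
    }
    return found;
}
```

    Callers then test the returned pointer against NULL instead of
    dereferencing the iterator after the loop.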

    Fixes: 8c242db9b8c0 ("f2fs: fix stale ATOMIC_WRITTEN_PAGE private pointer")
    Signed-off-by: Jakob Koschel
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jakob Koschel
     
  • [ Upstream commit 12662d19467b391b5b509ac5e9ab4f583c6dde16 ]

    As Wenqing reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215765

    It will cause a kernel panic with steps:
    - mkdir mnt
    - mount tmp40.img mnt
    - ls mnt

    folio_mark_dirty+0x33/0x50
    f2fs_add_regular_entry+0x541/0xad0 [f2fs]
    f2fs_add_dentry+0x6c/0xb0 [f2fs]
    f2fs_do_add_link+0x182/0x230 [f2fs]
    __recover_dot_dentries+0x2d6/0x470 [f2fs]
    f2fs_lookup+0x5af/0x6a0 [f2fs]
    __lookup_slow+0xac/0x200
    lookup_slow+0x45/0x70
    walk_component+0x16c/0x250
    path_lookupat+0x8b/0x1f0
    filename_lookup+0xef/0x250
    user_path_at_empty+0x46/0x70
    vfs_statx+0x98/0x190
    __do_sys_newlstat+0x41/0x90
    __x64_sys_newlstat+0x1a/0x30
    do_syscall_64+0x37/0xb0
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    The root cause is that for a special file (e.g. a character, block, fifo
    or socket file), f2fs doesn't assign an address space operations pointer
    array to the mapping->a_ops field. So, in a fuzzed image, if the
    inline_dots flag is set on a special file, then during lookup(), when
    f2fs runs into __recover_dot_dentries(), a NULL pointer access occurs
    once f2fs_add_regular_entry() calls a_ops->set_dirty_page().

    Fixes: 510022a85839 ("f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries")
    Reported-by: Wenqing Liu
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 10a26878564f27327b12e8f4b4d8d7b43065fae5 ]

    This patch adds a new function f2fs_dquot_initialize() to wrap
    dquot_initialize(), and it supports injecting a fault into
    f2fs_dquot_initialize() to simulate an inner failure occurring in
    dquot_initialize().

    Usage:
    a) echo 65536 > /sys/fs/f2fs//inject_type or
    b) mount -o fault_type=65536
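
    The value 65536 used above is a single-bit mask (65536 == 1 << 16).
    A small sketch of how such a mask is formed follows; the macro name
    is hypothetical and the bit position is inferred only from the value.

```c
#include <assert.h>

/* Bit position implied by the value 65536; the name is hypothetical. */
#define DQUOT_INIT_FAULT_BIT 16u

/* Build a fault-type mask with exactly one bit set. */
static unsigned int fault_type_mask(unsigned int bit)
{
    assert(bit < 32);
    return 1u << bit;
}
```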

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     

01 May, 2022

1 commit

  • commit a6294593e8a1290091d0b078d5d33da5e0cd3dfe upstream

    Turn iov_iter_fault_in_readable into a function that returns the number
    of bytes not faulted in, similar to copy_to_user, instead of returning a
    non-zero value when any of the requested pages couldn't be faulted in.
    This supports the existing users that require all pages to be faulted in
    as well as new users that are happy if any pages can be faulted in.

    Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
    sure this change doesn't silently break things.
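
    The new return convention can be modelled in user space as follows.
    model_fault_in() is a hypothetical stand-in that returns the number of
    bytes not faulted in; the two callers illustrate the "need everything"
    and "any progress is fine" styles mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Model: only 'available' of the 'requested' bytes can be faulted in.
 * Returns the number of bytes that could NOT be faulted in.
 */
static size_t model_fault_in(size_t requested, size_t available)
{
    return requested > available ? requested - available : 0;
}

/* Caller that requires every byte: any shortfall is an error. */
static int need_all(size_t requested, size_t available)
{
    return model_fault_in(requested, available) ? -EFAULT : 0;
}

/* Caller that can proceed with a partial result: fail only on zero progress. */
static int need_some(size_t requested, size_t available)
{
    assert(requested > 0);
    return model_fault_in(requested, available) == requested ? -EFAULT : 0;
}
```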

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Anand Jain
    Signed-off-by: Greg Kroah-Hartman

    Andreas Gruenbacher
     

08 Apr, 2022

11 commits

  • [ Upstream commit d284af43f703760e261b1601378a0c13a19d5f1f ]

    In lz4_decompress_pages(), if size of decompressed data is not equal to
    expected one, we should print the size rather than size of target buffer
    for decompressed data, fix it.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 98237fcda4a24e67b0a4498c17d5aa4ad4537bc7 ]

    [14696.634553] task:cat state:D stack: 0 pid:1613738 ppid:1613735 flags:0x00000004
    [14696.638285] Call Trace:
    [14696.639038]
    [14696.640032] __schedule+0x302/0x930
    [14696.640969] schedule+0x58/0xd0
    [14696.641799] schedule_preempt_disabled+0x18/0x30
    [14696.642890] __mutex_lock.constprop.0+0x2fb/0x4f0
    [14696.644035] ? mod_objcg_state+0x10c/0x310
    [14696.645040] ? obj_cgroup_charge+0xe1/0x170
    [14696.646067] __mutex_lock_slowpath+0x13/0x20
    [14696.647126] mutex_lock+0x34/0x40
    [14696.648070] stat_show+0x25/0x17c0 [f2fs]
    [14696.649218] seq_read_iter+0x120/0x4b0
    [14696.650289] ? aa_file_perm+0x12a/0x500
    [14696.651357] ? lru_cache_add+0x1c/0x20
    [14696.652470] seq_read+0xfd/0x140
    [14696.653445] full_proxy_read+0x5c/0x80
    [14696.654535] vfs_read+0xa0/0x1a0
    [14696.655497] ksys_read+0x67/0xe0
    [14696.656502] __x64_sys_read+0x1a/0x20
    [14696.657580] do_syscall_64+0x3b/0xc0
    [14696.658671] entry_SYSCALL_64_after_hwframe+0x44/0xae
    [14696.660068] RIP: 0033:0x7efe39df1cb2
    [14696.661133] RSP: 002b:00007ffc8badd948 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
    [14696.662958] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007efe39df1cb2
    [14696.664757] RDX: 0000000000020000 RSI: 00007efe399df000 RDI: 0000000000000003
    [14696.666542] RBP: 00007efe399df000 R08: 00007efe399de010 R09: 00007efe399de010
    [14696.668363] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000000000
    [14696.670155] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
    [14696.671965]
    [14696.672826] task:umount state:D stack: 0 pid:1614985 ppid:1614984 flags:0x00004000
    [14696.674930] Call Trace:
    [14696.675903]
    [14696.676780] __schedule+0x302/0x930
    [14696.677927] schedule+0x58/0xd0
    [14696.679019] schedule_preempt_disabled+0x18/0x30
    [14696.680412] __mutex_lock.constprop.0+0x2fb/0x4f0
    [14696.681783] ? destroy_inode+0x65/0x80
    [14696.683006] __mutex_lock_slowpath+0x13/0x20
    [14696.684305] mutex_lock+0x34/0x40
    [14696.685442] f2fs_destroy_stats+0x1e/0x60 [f2fs]
    [14696.686803] f2fs_put_super+0x158/0x390 [f2fs]
    [14696.688238] generic_shutdown_super+0x7a/0x120
    [14696.689621] kill_block_super+0x27/0x50
    [14696.690894] kill_f2fs_super+0x7f/0x100 [f2fs]
    [14696.692311] deactivate_locked_super+0x35/0xa0
    [14696.693698] deactivate_super+0x40/0x50
    [14696.694985] cleanup_mnt+0x139/0x190
    [14696.696209] __cleanup_mnt+0x12/0x20
    [14696.697390] task_work_run+0x64/0xa0
    [14696.698587] exit_to_user_mode_prepare+0x1b7/0x1c0
    [14696.700053] syscall_exit_to_user_mode+0x27/0x50
    [14696.701418] do_syscall_64+0x48/0xc0
    [14696.702630] entry_SYSCALL_64_after_hwframe+0x44/0xae

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit ba900534f807f0b327c92d5141c85d2313e2d55c ]

    Let's purge inode cache in order to avoid the below deadlock.

    [freeze test]                         shrinker
    freeze_super
     - percpu_down_write(SB_FREEZE_FS)
                                           - super_cache_scan
                                             - down_read(&sb->s_umount)
                                               - prune_icache_sb
                                                - dispose_list
                                                 - evict
                                                  - f2fs_evict_inode
    thaw_super
     - down_write(&sb->s_umount);
                                                   - __percpu_down_read(SB_FREEZE_FS)

    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit f41ee8b91c00770d718be2ff4852a80017ae9ab3 ]

    As Wenqing Liu reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215657

    - Overview
    UBSAN: array-index-out-of-bounds in fs/f2fs/segment.c:3460:2 when mount and operate a corrupted image

    - Reproduce
    tested on kernel 5.17-rc4, 5.17-rc6

    1. mkdir test_crash
    2. cd test_crash
    3. unzip tmp2.zip
    4. mkdir mnt
    5. ./single_test.sh f2fs 2

    - Kernel dump
    [ 46.434454] loop0: detected capacity change from 0 to 131072
    [ 46.529839] F2FS-fs (loop0): Mounted with checkpoint version = 7548c2d9
    [ 46.738319] ================================================================================
    [ 46.738412] UBSAN: array-index-out-of-bounds in fs/f2fs/segment.c:3460:2
    [ 46.738475] index 231 is out of range for type 'unsigned int [2]'
    [ 46.738539] CPU: 2 PID: 939 Comm: umount Not tainted 5.17.0-rc6 #1
    [ 46.738547] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    [ 46.738551] Call Trace:
    [ 46.738556]
    [ 46.738563] dump_stack_lvl+0x47/0x5c
    [ 46.738581] ubsan_epilogue+0x5/0x50
    [ 46.738592] __ubsan_handle_out_of_bounds+0x68/0x80
    [ 46.738604] f2fs_allocate_data_block+0xdff/0xe60 [f2fs]
    [ 46.738819] do_write_page+0xef/0x210 [f2fs]
    [ 46.738934] f2fs_do_write_node_page+0x3f/0x80 [f2fs]
    [ 46.739038] __write_node_page+0x2b7/0x920 [f2fs]
    [ 46.739162] f2fs_sync_node_pages+0x943/0xb00 [f2fs]
    [ 46.739293] f2fs_write_checkpoint+0x7bb/0x1030 [f2fs]
    [ 46.739405] kill_f2fs_super+0x125/0x150 [f2fs]
    [ 46.739507] deactivate_locked_super+0x60/0xc0
    [ 46.739517] deactivate_super+0x70/0xb0
    [ 46.739524] cleanup_mnt+0x11a/0x200
    [ 46.739532] __cleanup_mnt+0x16/0x20
    [ 46.739538] task_work_run+0x67/0xa0
    [ 46.739547] exit_to_user_mode_prepare+0x18c/0x1a0
    [ 46.739559] syscall_exit_to_user_mode+0x26/0x40
    [ 46.739568] do_syscall_64+0x46/0xb0
    [ 46.739584] entry_SYSCALL_64_after_hwframe+0x44/0xae

    The root cause is that we missed a sanity check on curseg->alloc_type,
    resulting in an out-of-bounds access on the sbi->block_count[] array.
    Fix it.

    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 9b56adcf525522e9ffa52471260298d91fc1d395 ]

    When compressed file has blocks, f2fs_ioc_start_atomic_write will succeed,
    but compressed flag will be remained in inode. If write partial compreseed
    cluster and commit atomic write will cause data corruption.

    This is the reproduction process:
    Step 1:
    Create a compressed file, write 64K of data, and call fsync(); the blocks
    are then written as a compressed cluster.
    Step 2:
    ioctl(F2FS_IOC_START_ATOMIC_WRITE) -- this should fail, but does not.
    Write page 0 and page 3.
    ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE) -- pages 0 and 3 are written as in a
    normal file.
    Step 3:
    Drop caches.
    Read pages 0-4 -- since page 0 has a valid block address, the cluster is
    read as non-compressed, so pages 1 and 2 are filled with compressed data
    or zeros.

    The root cause is that, after commit 7eab7a696827 ("f2fs: compress: remove
    unneeded read when rewrite whole cluster"), f2fs_write_begin() in step 2
    only marks the target page dirty, and f2fs_commit_inmem_pages() then writes
    partial raw pages into the compressed cluster, corrupting the compressed
    cluster layout.
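
The reproduction steps suggest that starting an atomic write should be refused when a compressed file already holds blocks. A minimal sketch of such a guard follows, in user-space C. `struct file_state`, `start_atomic_write()`, and the choice of `-EINVAL` are assumptions for illustration, not the actual patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-file state, reduced to what the guard needs. */
struct file_state {
    bool compressed;          /* compressed flag set on the inode */
    unsigned long nr_blocks;  /* blocks already allocated to the file */
    bool atomic;              /* atomic write in progress */
};

/* Start an atomic write. A compressed file that already has blocks is
 * rejected; an empty compressed file has its flag cleared first. */
static int start_atomic_write(struct file_state *f)
{
    if (f->compressed) {
        if (f->nr_blocks > 0)
            return -EINVAL;   /* assumed error code */
        f->compressed = false;
    }
    f->atomic = true;
    return 0;
}
```

The point of the sketch is only the ordering: the compressed state is resolved before the file enters atomic-write mode, so raw pages are never mixed into a compressed cluster.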

    Fixes: 4c8ff7095bef ("f2fs: support data compression")
    Fixes: 7eab7a696827 ("f2fs: compress: remove unneeded read when rewrite whole cluster")
    Reported-by: kernel test robot
    Reported-by: Dan Carpenter
    Signed-off-by: Fengnan Chang
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Fengnan Chang
     
  • [ Upstream commit 344150999b7fc88502a65bbb147a47503eca2033 ]

    Quoting Jing Xia's report, a potential deadlock may occur between the
    kworker and checkpoint, as below:

    [T:writeback]                   [T:checkpoint]
    - wb_writeback
     - blk_start_plug
     bio contains NodeA was plugged in writeback threads
                                    - do_writepages -- sync write inodeB, inc wb_sync_req[DATA]
                                     - f2fs_write_data_pages
                                      - f2fs_write_single_data_page -- write last dirty page
                                       - f2fs_do_write_data_page
                                        - set_page_writeback -- clear page dirty flag and
                                        PAGECACHE_TAG_DIRTY tag in radix tree
                                        - f2fs_outplace_write_data
                                         - f2fs_update_data_blkaddr
                                          - f2fs_wait_on_page_writeback -- wait NodeA to writeback here
                                       - inode_dec_dirty_pages
     - writeback_sb_inodes
      - writeback_single_inode
       - do_writepages
        - f2fs_write_data_pages -- skip writepages due to wb_sync_req[DATA]
         - wbc->pages_skipped += get_dirty_pages() -- PAGECACHE_TAG_DIRTY is not set but get_dirty_pages() returns one
      - requeue_inode -- requeue inode to wb->b_dirty queue due to non-zero.pages_skipped
     - blk_finish_plug

    Let's try to avoid the deadlock by forcing an unplug of the previous bio via
    blk_finish_plug(current->plug) once we've skipped writeback in writepages()
    due to a valid sbi->wb_sync_req[DATA/NODE].

    Fixes: 687de7f1010c ("f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE")
    Signed-off-by: Zhiguo Niu
    Signed-off-by: Jing Xia
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • [ Upstream commit 2fef99b8372c1ae3d8445ab570e888b5a358dbe9 ]

    This patch fixes xfstests/generic/475 failure.

    [ 293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
    [ 293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
    [ 293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
    [ 293.691764] sh (7615): drop_caches: 3
    [ 293.691819] sh (7616): drop_caches: 3
    [ 293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
    [ 293.695659] sh (7618): drop_caches: 3
    [ 293.696979] sh (7617): drop_caches: 3
    [ 293.700290] sh (7623): drop_caches: 3
    [ 293.708621] sh (7626): drop_caches: 3
    [ 293.711386] sh (7628): drop_caches: 3
    [ 293.711825] sh (7627): drop_caches: 3
    [ 293.716738] sh (7630): drop_caches: 3
    [ 293.719613] sh (7632): drop_caches: 3
    [ 293.720971] sh (7633): drop_caches: 3
    [ 293.727741] sh (7634): drop_caches: 3
    [ 293.730783] sh (7636): drop_caches: 3
    [ 293.732681] sh (7635): drop_caches: 3
    [ 293.732988] sh (7637): drop_caches: 3
    [ 293.738836] sh (7639): drop_caches: 3
    [ 293.740568] sh (7641): drop_caches: 3
    [ 293.743053] sh (7640): drop_caches: 3
    [ 293.821889] ------------[ cut here ]------------
    [ 293.824654] kernel BUG at fs/f2fs/node.c:3334!
    [ 293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
    [ 293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G OE 5.17.0-rc1-custom #1
    [ 293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
    [ 293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
    [ 293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
    [ 293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
    [ 293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
    [ 293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
    [ 293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
    [ 293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
    [ 293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
    [ 293.847855] FS: 00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
    [ 293.850647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
    [ 293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 293.858380] Call Trace:
    [ 293.859302]
    [ 293.860311] ? ttwu_do_wakeup+0x1c/0x170
    [ 293.861800] ? ttwu_do_activate+0x6d/0xb0
    [ 293.863057] ? _raw_spin_unlock_irqrestore+0x29/0x40
    [ 293.864411] ? try_to_wake_up+0x9d/0x5e0
    [ 293.865618] ? debug_smp_processor_id+0x17/0x20
    [ 293.866934] ? debug_smp_processor_id+0x17/0x20
    [ 293.868223] ? free_unref_page+0xbf/0x120
    [ 293.869470] ? __free_slab+0xcb/0x1c0
    [ 293.870614] ? preempt_count_add+0x7a/0xc0
    [ 293.871811] ? __slab_free+0xa0/0x2d0
    [ 293.872918] ? __wake_up_common_lock+0x8a/0xc0
    [ 293.874186] ? __slab_free+0xa0/0x2d0
    [ 293.875305] ? free_inode_nonrcu+0x20/0x20
    [ 293.876466] ? free_inode_nonrcu+0x20/0x20
    [ 293.877650] ? debug_smp_processor_id+0x17/0x20
    [ 293.878949] ? call_rcu+0x11a/0x240
    [ 293.880060] ? f2fs_destroy_stats+0x59/0x60 [f2fs]
    [ 293.881437] ? kfree+0x1fe/0x230
    [ 293.882674] f2fs_put_super+0x160/0x390 [f2fs]
    [ 293.883978] generic_shutdown_super+0x7a/0x120
    [ 293.885274] kill_block_super+0x27/0x50
    [ 293.886496] kill_f2fs_super+0x7f/0x100 [f2fs]
    [ 293.887806] deactivate_locked_super+0x35/0xa0
    [ 293.889271] deactivate_super+0x40/0x50
    [ 293.890513] cleanup_mnt+0x139/0x190
    [ 293.891689] __cleanup_mnt+0x12/0x20
    [ 293.892850] task_work_run+0x64/0xa0
    [ 293.894035] exit_to_user_mode_prepare+0x1b7/0x1c0
    [ 293.895409] syscall_exit_to_user_mode+0x27/0x50
    [ 293.896872] do_syscall_64+0x48/0xc0
    [ 293.898090] entry_SYSCALL_64_after_hwframe+0x44/0xae
    [ 293.899517] RIP: 0033:0x7f5a975cd25b

    Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Jaegeuk Kim
     
  • [ Upstream commit 7d19e3dab0002e527052b0aaf986e8c32e5537bf ]

    sbi->gc_mode needs to be assigned GC_IDLE_AT rather than GC_AT when the
    user tries to enable ATGC via the gc_idle sysfs interface. Fix it.
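
A small sketch of why the two constants are not interchangeable, assuming they come from two separate enums whose numeric values overlap. The enum layouts, the sysfs value `3`, and `store_gc_idle()` are assumptions made for illustration; only the names GC_IDLE_AT and GC_AT come from the commit message.

```c
/* Assumed layout: a victim-selection enum and a GC-mode enum. */
enum victim_policy { GC_CB, GC_GREEDY, GC_AT };
enum gc_mode { GC_NORMAL, GC_IDLE_CB, GC_IDLE_GREEDY, GC_IDLE_AT };

/* Hypothetical value a user writes to gc_idle to request ATGC. */
#define GC_IDLE_REQUEST_AT 3

/* Translate the value written by the user into a gc_mode.
 * Returns 0 on success, -1 for an unknown value. */
static int store_gc_idle(enum gc_mode *mode, unsigned int val)
{
    switch (val) {
    case 0:
        *mode = GC_NORMAL;
        return 0;
    case 1:
        *mode = GC_IDLE_CB;
        return 0;
    case 2:
        *mode = GC_IDLE_GREEDY;
        return 0;
    case GC_IDLE_REQUEST_AT:
        *mode = GC_IDLE_AT;   /* a mode value, not the policy value GC_AT */
        return 0;
    default:
        return -1;
    }
}
```

Under the assumed layout, GC_AT has the same numeric value as GC_IDLE_GREEDY, so storing it in the mode variable would silently select a different mode than the one requested.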

    Fixes: 093749e296e2 ("f2fs: support age threshold based garbage collection")
    Cc: Zhipeng Tan
    Signed-off-by: Jicheng Shao
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Chao Yu
     
  • commit 5b5b4f85b01604389f7a0f11ef180a725bf0e2d4 upstream.

    As bughunter reported in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215709

    f2fs may hang when mounting a fuzzed image; dmesg shows the following:

    __filemap_get_folio+0x3a9/0x590
    pagecache_get_page+0x18/0x60
    __get_meta_page+0x95/0x460 [f2fs]
    get_checkpoint_version+0x2a/0x1e0 [f2fs]
    validate_checkpoint+0x8e/0x2a0 [f2fs]
    f2fs_get_valid_checkpoint+0xd0/0x620 [f2fs]
    f2fs_fill_super+0xc01/0x1d40 [f2fs]
    mount_bdev+0x18a/0x1c0
    f2fs_mount+0x15/0x20 [f2fs]
    legacy_get_tree+0x28/0x50
    vfs_get_tree+0x27/0xc0
    path_mount+0x480/0xaa0
    do_mount+0x7c/0xa0
    __x64_sys_mount+0x8b/0xe0
    do_syscall_64+0x38/0xc0
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    The root cause is that the cp_pack_total_block_count field in the
    checkpoint was fuzzed to one. As calculated from it, the two cp pack
    blocks are located at the same block address, so reading the latter cp
    pack block blocks on the page lock, which is already held from reading
    the previous cp pack block. Fix it by adding a sanity check for
    cp_pack_total_block_count.
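
The arithmetic behind this can be sketched as follows, assuming the last block of a cp pack sits at `cp_addr + total - 1`. The helper names, the lower bound of 2, and the `blocks_per_seg` upper bound are assumptions for illustration; the kernel's actual check may be stricter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address of the last block in a checkpoint pack that starts at cp_addr
 * and spans 'total' blocks (assumed layout). */
static uint32_t last_cp_block(uint32_t cp_addr, uint32_t total)
{
    return cp_addr + total - 1;
}

/* Reject counts for which the first and last pack block would coincide,
 * or which would not fit into one segment. */
static bool cp_block_count_is_sane(uint32_t total, uint32_t blocks_per_seg)
{
    return total >= 2 && total <= blocks_per_seg;
}
```

With `total == 1`, `last_cp_block()` returns `cp_addr` itself, which is exactly the case where the second read would wait on a page lock that the first read still holds.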

    Cc: stable@vger.kernel.org
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 680af5b824a52faa819167628665804a14f0e0df upstream.

    cnt should be passed to sb_has_quota_active() instead of type to check
    active quota properly.

    Moreover, when the type is -1, the compiler with enough inline knowledge
    can discard sb_has_quota_active() check altogether, causing a NULL pointer
    dereference at the following inode_lock(dqopt->files[cnt]):

    [ 2.796010] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0
    [ 2.796024] Mem abort info:
    [ 2.796025] ESR = 0x96000005
    [ 2.796028] EC = 0x25: DABT (current EL), IL = 32 bits
    [ 2.796029] SET = 0, FnV = 0
    [ 2.796031] EA = 0, S1PTW = 0
    [ 2.796032] Data abort info:
    [ 2.796034] ISV = 0, ISS = 0x00000005
    [ 2.796035] CM = 0, WnR = 0
    [ 2.796046] user pgtable: 4k pages, 39-bit VAs, pgdp=00000003370d1000
    [ 2.796048] [00000000000000a0] pgd=0000000000000000, pud=0000000000000000
    [ 2.796051] Internal error: Oops: 96000005 [#1] PREEMPT SMP
    [ 2.796056] CPU: 7 PID: 640 Comm: f2fs_ckpt-259:7 Tainted: G S 5.4.179-arter97-r8-64666-g2f16e087f9d8 #1
    [ 2.796057] Hardware name: Qualcomm Technologies, Inc. Lahaina MTP lemonadep (DT)
    [ 2.796059] pstate: 80c00005 (Nzcv daif +PAN +UAO)
    [ 2.796065] pc : down_write+0x28/0x70
    [ 2.796070] lr : f2fs_quota_sync+0x100/0x294
    [ 2.796071] sp : ffffffa3f48ffc30
    [ 2.796073] x29: ffffffa3f48ffc30 x28: 0000000000000000
    [ 2.796075] x27: ffffffa3f6d718b8 x26: ffffffa415fe9d80
    [ 2.796077] x25: ffffffa3f7290048 x24: 0000000000000001
    [ 2.796078] x23: 0000000000000000 x22: ffffffa3f7290000
    [ 2.796080] x21: ffffffa3f72904a0 x20: ffffffa3f7290110
    [ 2.796081] x19: ffffffa3f77a9800 x18: ffffffc020aae038
    [ 2.796083] x17: ffffffa40e38e040 x16: ffffffa40e38e6d0
    [ 2.796085] x15: ffffffa40e38e6cc x14: ffffffa40e38e6d0
    [ 2.796086] x13: 00000000000004f6 x12: 00162c44ff493000
    [ 2.796088] x11: 0000000000000400 x10: ffffffa40e38c948
    [ 2.796090] x9 : 0000000000000000 x8 : 00000000000000a0
    [ 2.796091] x7 : 0000000000000000 x6 : 0000d1060f00002a
    [ 2.796093] x5 : ffffffa3f48ff718 x4 : 000000000000000d
    [ 2.796094] x3 : 00000000060c0000 x2 : 0000000000000001
    [ 2.796096] x1 : 0000000000000000 x0 : 00000000000000a0
    [ 2.796098] Call trace:
    [ 2.796100] down_write+0x28/0x70
    [ 2.796102] f2fs_quota_sync+0x100/0x294
    [ 2.796104] block_operations+0x120/0x204
    [ 2.796106] f2fs_write_checkpoint+0x11c/0x520
    [ 2.796107] __checkpoint_and_complete_reqs+0x7c/0xd34
    [ 2.796109] issue_checkpoint_thread+0x6c/0xb8
    [ 2.796112] kthread+0x138/0x414
    [ 2.796114] ret_from_fork+0x10/0x18
    [ 2.796117] Code: aa0803e0 aa1f03e1 52800022 aa0103e9 (c8e97d02)
    [ 2.796120] ---[ end trace 96e942e8eb6a0b53 ]---
    [ 2.800116] Kernel panic - not syncing: Fatal exception
    [ 2.800120] SMP: stopping secondary CPUs
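
The cnt-versus-type distinction described in this commit can be sketched as a loop over all quota types, where `type == -1` means "sync every type". `MAXQUOTAS_SKETCH`, `struct quota_opts`, and both helpers are hypothetical user-space stand-ins, not the kernel's quota API.

```c
#include <stdbool.h>

#define MAXQUOTAS_SKETCH 3   /* assumed number of quota types */

struct quota_opts {
    bool active[MAXQUOTAS_SKETCH];  /* is quota of this type active? */
    int synced[MAXQUOTAS_SKETCH];   /* how often each type was synced */
};

/* Stand-in for an "is this quota type active" check. */
static bool quota_active(const struct quota_opts *q, int type)
{
    return type >= 0 && type < MAXQUOTAS_SKETCH && q->active[type];
}

/* Sync one quota type, or all of them when type == -1.
 * Returns the number of types actually synced. */
static int quota_sync(struct quota_opts *q, int type)
{
    int n = 0;

    for (int cnt = 0; cnt < MAXQUOTAS_SKETCH; cnt++) {
        if (type != -1 && cnt != type)
            continue;
        /* Check the loop index cnt, not the caller's type: with
         * type == -1 the per-type check must still run for each cnt. */
        if (!quota_active(q, cnt))
            continue;
        q->synced[cnt]++;
        n++;
    }
    return n;
}
```

Checking `cnt` keeps inactive types from ever reaching the per-file work, regardless of what the caller passed in `type`.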

    Fixes: 9de71ede81e6 ("f2fs: quota: fix potential deadlock")
    Cc: # v5.15+
    Signed-off-by: Juhyung Park
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Juhyung Park
     
  • commit 6d18762ed5cd549fde74fd0e05d4d87bac5a3beb upstream.

    As Pavel Machek reported in the link below [1]:

    After commit 77900c45ee5c ("f2fs: fix to do sanity check in is_alive()"),
    the node page should be unlocked by calling f2fs_put_page() in the error
    path of is_alive(); otherwise, f2fs may hang when it tries to lock the
    node page. Fix it.

    [1] https://lore.kernel.org/stable/20220124203637.GA19321@duo.ucw.cz/
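
A minimal sketch of the error-path pattern described above, using a mock page with a lock flag and a reference count. `struct node_page`, `get_node_page()`, `put_node_page()`, and `is_alive_sketch()` are hypothetical user-space stand-ins and do not reproduce the kernel functions.

```c
#include <stdbool.h>

/* Mock node page: just enough state to show lock/unlock balance. */
struct node_page {
    int locked;             /* 1 while the page lock is held */
    int refs;               /* reference count */
    unsigned int nr_addrs;  /* number of valid address slots */
};

/* Take a reference and lock the page. */
static void get_node_page(struct node_page *p)
{
    p->refs++;
    p->locked = 1;
}

/* Drop the reference, optionally unlocking first. */
static void put_node_page(struct node_page *p, bool unlock)
{
    if (unlock)
        p->locked = 0;
    p->refs--;
}

/* Return true if slot 'ofs' is within the page. Every exit path,
 * including the sanity-check failure, releases the page lock. */
static bool is_alive_sketch(struct node_page *p, unsigned int ofs)
{
    get_node_page(p);

    if (ofs >= p->nr_addrs) {
        put_node_page(p, true);   /* the unlock the error path needs */
        return false;
    }

    put_node_page(p, true);
    return true;
}
```

If the early return skipped `put_node_page()`, the mock page would stay locked, which mirrors the hang described in the report.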

    Fixes: 77900c45ee5c ("f2fs: fix to do sanity check in is_alive()")
    Cc:
    Reported-by: Pavel Machek
    Signed-off-by: Pavel Machek
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     

27 Jan, 2022

8 commits

  • commit b702c83e2eaa2fa2d72e957c55c0321535cc8b9f upstream.

    Otherwise, nat_bit area may be persisted across boundary of CP area during
    nat_bit rebuilding.

    Fixes: 94c821fb286b ("f2fs: rebuild nat_bits during umount")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 300a842937fbcfb5a189cea9ba15374fdb0b5c6b upstream.

    https://bugzilla.kernel.org/show_bug.cgi?id=204137

    With the script below, we hit a panic during new segment allocation:

    DISK=bingo.img
    MOUNT_DIR=/mnt/f2fs

    dd if=/dev/zero of=$DISK bs=1M count=105
    mkfs.f2fs -a 1 -o 19 -t 1 -z 1 -f -q $DISK

    mount -t f2fs $DISK $MOUNT_DIR -o "noinline_dentry,flush_merge,noextent_cache,mode=lfs,io_bits=7,fsync_mode=strict"

    for (( i = 0; i < 4096; i++ )); do
        name=`head /dev/urandom | tr -dc A-Za-z0-9 | head -c 10`
        mkdir $MOUNT_DIR/$name
    done

    umount $MOUNT_DIR
    rm $DISK

    Chao Yu
     
  • commit 7377e853967ba45bf409e3b5536624d2cbc99f21 upstream.

    There is a potential deadlock between the writeback process and a process
    performing write_begin() or write_cache_pages() when both try to write the
    same compressed file whose data is not compressible, as below:

    [Process A] - doing checkpoint
    [Process B]                         [Process C]
    f2fs_write_cache_pages()
    - lock_page() [all pages in cluster, 0-31]
    - f2fs_write_multi_pages()
     - f2fs_write_raw_pages()
      - f2fs_write_single_data_page()
       - f2fs_do_write_data_page()
        - return -EAGAIN [f2fs_trylock_op() failed]
       - unlock_page(page) [e.g., page 0]
                                        - generic_perform_write()
                                         - f2fs_write_begin()
                                          - f2fs_prepare_compress_overwrite()
                                           - prepare_compress_overwrite()
                                            - lock_page() [e.g., page 0]
                                            - lock_page() [e.g., page 1]
       - lock_page(page) [e.g., page 0]
    Since no compression takes place, it is no longer necessary to hold
    locks on every page in the cluster within f2fs_write_raw_pages().

    This patch changes f2fs_write_raw_pages() to release all locks first
    and then perform the write in the same way as for a non-compressed file
    in f2fs_write_cache_pages().

    Fixes: 4c8ff7095bef ("f2fs: support data compression")
    Signed-off-by: Hyeong-Jun Kim
    Signed-off-by: Sungjong Seo
    Signed-off-by: Youngjin Gil
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Hyeong-Jun Kim
     
  • commit f6db43076d190d9bf75559dec28e18b9d12e4ce5 upstream.

    As reported by Wenqing Liu in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215231

    If we enable the CONFIG_F2FS_CHECK_FS config option and use the fuzzed
    image attached in the above link, we encounter a panic when executing the
    script below:

    1. mkdir mnt
    2. mount -t f2fs tmp1.img mnt
    3. touch tmp

    F2FS-fs (loop11): mismatched blkaddr 5765 (source_blkaddr 1) in seg 3
    kernel BUG at fs/f2fs/gc.c:1042!
    do_garbage_collect+0x90f/0xa80 [f2fs]
    f2fs_gc+0x294/0x12a0 [f2fs]
    f2fs_balance_fs+0x2c5/0x7d0 [f2fs]
    f2fs_create+0x239/0xd90 [f2fs]
    lookup_open+0x45e/0xa90
    open_last_lookups+0x203/0x670
    path_openat+0xae/0x490
    do_filp_open+0xbc/0x160
    do_sys_openat2+0x2f1/0x500
    do_sys_open+0x5e/0xa0
    __x64_sys_openat+0x28/0x40

    Previously, f2fs tried to catch data inconsistency between the SSA and
    SIT tables during GC; however, once the inconsistency was caught, it
    called f2fs_bug_on() to hang the kernel. That is not needed. Instead,
    let's set the SBI_NEED_FSCK flag and skip migrating the current block.
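
The "flag and skip" behaviour can be sketched as a loop that compares two views of each block address and carries on after a mismatch instead of aborting. `struct fs_state`, `migrate_segment()`, and the array-based inputs are hypothetical simplifications, not the GC code itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filesystem state: only the "needs fsck" flag. */
struct fs_state {
    bool need_fsck;
};

/* Walk n blocks of a segment. 'recorded' is the address one table claims
 * for each block, 'source' the address actually being collected. On a
 * mismatch, mark the filesystem as needing fsck and skip that block
 * rather than stopping. Returns the number of blocks migrated. */
static size_t migrate_segment(struct fs_state *fs,
                              const uint32_t *recorded,
                              const uint32_t *source, size_t n)
{
    size_t migrated = 0;

    for (size_t i = 0; i < n; i++) {
        if (recorded[i] != source[i]) {
            fs->need_fsck = true;   /* remember the inconsistency */
            continue;               /* skip only this block */
        }
        migrated++;
    }
    return migrated;
}
```

The remaining consistent blocks are still processed, and the flag leaves the repair to a later fsck run.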

    Fixes: bbf9f7d90f21 ("f2fs: Fix indefinite loop in f2fs_gc()")
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit d1917865a7906baf6b687e15e8e6195a295a3992 upstream.

    Since the compress inode is not a regular file, generic_error_remove_page()
    in f2fs_invalidate_compress_pages() always fails. Set the compress inode
    as a regular file to fix it.

    Fixes: 6ce19aff0b8c ("f2fs: compress: add compress_inode to cache compressed blocks")
    Signed-off-by: Fengnan Chang
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Fengnan Chang
     
  • commit 19bdba5265624ba6b9d9dd936a0c6ccc167cfe80 upstream.

    Android OTA failed due to SBI_NEED_FSCK flag when pinning the file. Let's avoid
    it since we can do in-place-updates.

    Cc: stable@vger.kernel.org
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     
  • commit 77900c45ee5cd5da63bd4d818a41dbdf367e81cd upstream.

    In a fuzzed image, the SSA table may indicate that a data block belongs to
    an invalid node whose node ID is out of range (0, 1, 2 or max_nid). To
    avoid migrating inconsistent data in such a corrupted image, let's do a
    sanity check before data block migration.
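
A minimal sketch of a node ID range check consistent with the values listed above (0, 1, 2 and max_nid are invalid). `FIRST_VALID_NID` and `nid_in_range()` are names assumed for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption drawn from the text: IDs 0, 1 and 2 are never valid. */
#define FIRST_VALID_NID 3u

/* A node ID is usable only if it lies in [FIRST_VALID_NID, max_nid). */
static bool nid_in_range(uint32_t nid, uint32_t max_nid)
{
    return nid >= FIRST_VALID_NID && nid < max_nid;
}
```

A caller would run this check on the node ID taken from the summary entry and skip the block when it fails.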

    Cc: stable@vger.kernel.org
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu
     
  • commit 9056d6489f5a41cfbb67f719d2c0ce61ead72d9f upstream.

    As reported by Wenqing Liu in bugzilla:

    https://bugzilla.kernel.org/show_bug.cgi?id=215231

    - Overview
    kernel NULL pointer dereference triggered in folio_mark_dirty() when mounting and operating on a crafted f2fs image

    - Reproduce
    tested on kernel 5.16-rc3, 5.15.X under root

    1. mkdir mnt
    2. mount -t f2fs tmp1.img mnt
    3. touch tmp
    4. cp tmp mnt

    F2FS-fs (loop0): sanity_check_inode: inode (ino=49) extent info [5942, 4294180864, 4] is incorrect, run fsck to fix
    F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=31340049, run fsck to fix.
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    folio_mark_dirty+0x33/0x50
    move_data_page+0x2dd/0x460 [f2fs]
    do_garbage_collect+0xc18/0x16a0 [f2fs]
    f2fs_gc+0x1d3/0xd90 [f2fs]
    f2fs_balance_fs+0x13a/0x570 [f2fs]
    f2fs_create+0x285/0x840 [f2fs]
    path_openat+0xe6d/0x1040
    do_filp_open+0xc5/0x140
    do_sys_openat2+0x23a/0x310
    do_sys_open+0x57/0x80

    The root cause is that for special files (e.g. character, block, fifo or
    socket files), f2fs doesn't assign an address space operations pointer
    array to the mapping->a_ops field. So when the SSA table in a fuzzed image
    indicates that a data block belongs to a special file and f2fs tries to
    migrate that block, move_data_page() calling a_ops->set_dirty_page()
    causes a NULL pointer access.

    Cc: stable@vger.kernel.org
    Reported-by: Wenqing Liu
    Signed-off-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Chao Yu