12 Aug, 2020

1 commit

  • Pull virtio updates from Michael Tsirkin:

    - IRQ bypass support for vdpa and IFC

    - MLX5 vdpa driver

    - Endianness fixes for virtio drivers

    - Misc other fixes

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (71 commits)
    vdpa/mlx5: fix up endian-ness for mtu
    vdpa: Fix pointer math bug in vdpasim_get_config()
    vdpa/mlx5: Fix pointer math in mlx5_vdpa_get_config()
    vdpa/mlx5: fix memory allocation failure checks
    vdpa/mlx5: Fix uninitialised variable in core/mr.c
    vdpa_sim: init iommu lock
    virtio_config: fix up warnings on parisc
    vdpa/mlx5: Add VDPA driver for supported mlx5 devices
    vdpa/mlx5: Add shared memory registration code
    vdpa/mlx5: Add support library for mlx5 VDPA implementation
    vdpa/mlx5: Add hardware descriptive header file
    vdpa: Modify get_vq_state() to return error code
    net/vdpa: Use struct for set/get vq state
    vdpa: remove hard coded virtq num
    vdpasim: support batch updating
    vhost-vdpa: support IOTLB batching hints
    vhost-vdpa: support get/set backend features
    vhost: generialize backend features setting/getting
    vhost-vdpa: refine ioctl pre-processing
    vDPA: dont change vq irq after DRIVER_OK
    ...

    Linus Torvalds
     

05 Aug, 2020

2 commits

  • Virtio fs is modern-only. Use LE accessors for config space.
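
    As an illustration of what a little-endian accessor does, here is a
    minimal userspace sketch. It assumes a plain byte buffer stands in for
    the device config space; read_le32() is a hypothetical helper and not
    the kernel's virtio accessor.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode a 32-bit little-endian field from raw bytes.
 * The result is the same on big-endian and little-endian hosts because the
 * bytes are combined explicitly rather than reinterpreted in native order.
 */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```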

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     
  • Pull uninitialized_var() macro removal from Kees Cook:
    "This is long overdue, and has hidden too many bugs over the years. The
    series has several "by hand" fixes, and then a trivial treewide
    replacement.

    - Clean up non-trivial uses of uninitialized_var()

    - Update documentation and checkpatch for uninitialized_var() removal

    - Treewide removal of uninitialized_var()"

    * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    compiler: Remove uninitialized_var() macro
    treewide: Remove uninitialized_var() usage
    checkpatch: Remove awareness of uninitialized_var() macro
    mm/debug_vm_pgtable: Remove uninitialized_var() usage
    f2fs: Eliminate usage of uninitialized_var() macro
    media: sur40: Remove uninitialized_var() usage
    KVM: PPC: Book3S PR: Remove uninitialized_var() usage
    clk: spear: Remove uninitialized_var() usage
    clk: st: Remove uninitialized_var() usage
    spi: davinci: Remove uninitialized_var() usage
    ide: Remove uninitialized_var() usage
    rtlwifi: rtl8192cu: Remove uninitialized_var() usage
    b43: Remove uninitialized_var() usage
    drbd: Remove uninitialized_var() usage
    x86/mm/numa: Remove uninitialized_var() usage
    docs: deprecated.rst: Add uninitialized_var()

    Linus Torvalds
     

17 Jul, 2020

1 commit

  • Using uninitialized_var() is dangerous as it papers over real bugs[1]
    (or can in the future), and suppresses unrelated compiler warnings
    (e.g. "unused variable"). If the compiler thinks it is uninitialized,
    either simply initialize the variable or make compiler changes.

    In preparation for removing[2] the[3] macro[4], remove all remaining
    needless uses with the following script:

    git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
    xargs perl -pi -e \
    's/\buninitialized_var\(([^\)]+)\)/\1/g;
    s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

    drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
    pathological white-space.

    No outstanding warnings were found building allmodconfig with GCC 9.3.0
    for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
    alpha, and m68k.
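
    To make the hazard concrete, here is a small sketch. It assumes the
    macro expanded to the self-assignment "x = x"; the function and
    variable names are hypothetical and not taken from any converted file.

```c
#include <stddef.h>

/*
 * Before (kept as a comment because it returns an indeterminate value when
 * n == 0 or when no element is positive):
 *
 *	int uninitialized_var(last);	// assumed expansion: int last = last;
 *	for (i = 0; i < n; i++)
 *		if (v[i] > 0)
 *			last = v[i];
 *	return last;
 */

/* After: the variable gets an explicit, meaningful initial value. */
static int last_positive(const int *v, size_t n)
{
	int last = 0;	/* 0 means "no positive element found" */
	size_t i;

	for (i = 0; i < n; i++)
		if (v[i] > 0)
			last = v[i];
	return last;
}
```

    With the explicit initializer the compiler has nothing to warn about,
    and the empty-input case has a defined result.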

    [1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
    [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
    [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
    [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

    Reviewed-by: Leon Romanovsky # drivers/infiniband and mlx4/mlx5
    Acked-by: Jason Gunthorpe # IB
    Acked-by: Kalle Valo # wireless drivers
    Reviewed-by: Chao Yu # erofs
    Signed-off-by: Kees Cook

    Kees Cook
     

15 Jul, 2020

1 commit

  • The ioctl encoding for this parameter is a long but the documentation says
    it should be an int and the kernel drivers expect it to be an int. If the
    fuse driver treats this as a long it might end up scribbling over the stack
    of a userspace process that only allocated enough space for an int.

    This was previously discussed in [1] and a patch for fuse was proposed in
    [2]. From what I can tell the patch in [2] was nacked in favor of adding
    new, "fixed" ioctls and using those from userspace. However there is still
    no "fixed" version of these ioctls and the fact is that it's sometimes
    infeasible to change all userspace to use the new one.

    Handling the ioctls specially in the fuse driver seems like the most
    pragmatic way for fuse servers to support them without causing crashes in
    userspace applications that call them.
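
    The size mismatch can be inspected from userspace. The message above
    does not name the ioctl, so FS_IOC_GETFLAGS is used here only as an
    assumed example of an ioctl number encoded with a long argument. The
    helper names are hypothetical.

```c
#include <stddef.h>
#include <sys/ioctl.h>	/* _IOC_SIZE() */
#include <linux/fs.h>	/* FS_IOC_GETFLAGS, used only as an assumed example */

/* Size in bytes that the ioctl number itself advertises for its argument. */
static size_t encoded_arg_size(unsigned long cmd)
{
	return _IOC_SIZE(cmd);
}

/*
 * Bytes that would be written past the caller's buffer if the argument were
 * handled with the encoded size while userspace only provided an int.
 */
static size_t overrun_bytes(unsigned long cmd)
{
	size_t encoded = encoded_arg_size(cmd);

	return encoded > sizeof(int) ? encoded - sizeof(int) : 0;
}
```

    On an LP64 system the encoded size is 8 while an int is 4, so a copy
    based on the encoded size would overrun the caller's buffer by 4 bytes.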

    [1]: https://lore.kernel.org/linux-fsdevel/20131126200559.GH20559@hall.aurel32.net/T/
    [2]: https://sourceforge.net/p/fuse/mailman/message/31771759/

    Signed-off-by: Chirantan Ekbote
    Fixes: 59efec7b9039 ("fuse: implement ioctl support")
    Cc:
    Signed-off-by: Miklos Szeredi

    Chirantan Ekbote
     

14 Jul, 2020

7 commits

  • fuse_writepages() ignores some errors returned by fuse_writepages_fill().
    I believe this is a bug: if .writepages is called with WB_SYNC_ALL, it
    should either guarantee that all data was successfully saved or return an
    error.

    Fixes: 26d614df1da9 ("fuse: Implement writepages callback")
    Signed-off-by: Vasily Averin
    Signed-off-by: Miklos Szeredi

    Vasily Averin
     
  • fuse_writepages_fill() uses the following construction:

    if (wpa && ap->num_pages &&
        (A || B || C)) {
            action;
    } else if (wpa && D) {
            if (E) {
                    the same action;
            }
    }

    - ap->num_pages check is always true and can be removed

    - "if" and "else if" call the same action and can be merged.

    Move checking A, B, C, D, E conditions to a helper, add comments.
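
    The merge can be sketched in plain C, with booleans standing in for
    the conditions A to E. All function names here are hypothetical, and
    the always-true ap->num_pages check is already dropped from both
    versions.

```c
#include <stdbool.h>

/* Original shape: two branches that perform the same action. */
static bool original_takes_action(bool wpa, bool a, bool b, bool c,
				  bool d, bool e)
{
	if (wpa && (a || b || c)) {
		return true;
	} else if (wpa && d) {
		if (e)
			return true;
	}
	return false;
}

/* Merged shape: one helper decides, and each reason can carry a comment. */
static bool need_action(bool a, bool b, bool c, bool d, bool e)
{
	if (a || b || c)	/* any of the first group of reasons */
		return true;
	if (d && e)		/* the formerly nested case */
		return true;
	return false;
}

static bool merged_takes_action(bool wpa, bool a, bool b, bool c,
				bool d, bool e)
{
	return wpa && need_action(a, b, c, d, e);
}

/* Exhaustively compare both shapes over all 64 input combinations. */
static int count_mismatches(void)
{
	int bits, mismatches = 0;

	for (bits = 0; bits < 64; bits++) {
		bool wpa = bits & 1, a = bits & 2, b = bits & 4;
		bool c = bits & 8, d = bits & 16, e = bits & 32;

		if (original_takes_action(wpa, a, b, c, d, e) !=
		    merged_takes_action(wpa, a, b, c, d, e))
			mismatches++;
	}
	return mismatches;
}
```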

    Original-patch-by: Vasily Averin
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Previous patch changed handling of remount/reconfigure to ignore all
    options, including those that are unknown to the fuse kernel fs. This was
    done for backward compatibility, but this likely only affects the old
    mount(2) API.

    The new fsconfig(2) based reconfiguration could possibly be improved. This
    would make the new API less of a drop in replacement for the old, OTOH this
    is a good chance to get rid of some weirdnesses in the old API.

    Several other behaviors might make sense:

    1) unknown options are rejected, known options are ignored

    2) unknown options are rejected, known options are rejected if the value
    is changed, allowed otherwise

    3) all options are rejected

    Prior to the backward compatibility fix to ignore all options all known
    options were accepted (1), even if they change the value of a mount
    parameter; fuse_reconfigure() does not look at the config values set by
    fuse_parse_param().

    To fix that we'd need to verify that the value provided is the same as set
    in the initial configuration (2). The major drawback is that this is much
    more complex than just rejecting all attempts at changing options (3);
    i.e. all options signify initial configuration values and don't make sense
    on reconfigure.

    This patch opts for (3) with the rationale that no mount options are
    reconfigurable in fuse.
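
    The three candidate behaviors can be compared side by side in a small
    sketch. The enum and function names are hypothetical and do not
    correspond to the fuse implementation.

```c
#include <errno.h>
#include <stdbool.h>

enum reconf_policy {
	POLICY_IGNORE_KNOWN,	/* (1) reject unknown, ignore known */
	POLICY_REJECT_CHANGED,	/* (2) reject unknown and changed values */
	POLICY_REJECT_ALL,	/* (3) reject every option */
};

/* Return 0 if the option is accepted on reconfigure, -EINVAL otherwise. */
static int check_reconf_option(enum reconf_policy policy, bool known,
			       bool value_changed)
{
	switch (policy) {
	case POLICY_IGNORE_KNOWN:
		return known ? 0 : -EINVAL;
	case POLICY_REJECT_CHANGED:
		if (!known)
			return -EINVAL;
		/* needs the initial value for comparison: extra complexity */
		return value_changed ? -EINVAL : 0;
	case POLICY_REJECT_ALL:
		return -EINVAL;
	}
	return -EINVAL;
}
```

    Only policy (2) needs access to the initial configuration, which is
    the complexity argument made above in favor of (3).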

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • The command

    mount -o remount -o unknownoption /mnt/fuse

    succeeds on kernel versions prior to v5.4 and fails on v5.4 and later.
    This is because fuse_parse_param() rejects any unrecognised options in
    case of FS_CONTEXT_FOR_RECONFIGURE, just as for FS_CONTEXT_FOR_MOUNT.

    This causes a regression in case the fuse filesystem is in fstab, since
    remount sends all options found there to the kernel; even ones that are
    meant for the initial mount and are consumed by the userspace fuse server.

    Fix this by ignoring mount options, just as fuse_remount_fs() did prior to
    the conversion to the new API.

    Reported-by: Stefan Priebe
    Fixes: c30da2e981a7 ("fuse: convert to use the new mount API")
    Cc: # v5.4
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • s_op->remount_fs() is only called from legacy_reconfigure(), which is not
    used after being converted to the new API.

    Convert to using ->reconfigure(). This restores the previous behavior of
    syncing the filesystem and rejecting MS_MANDLOCK on remount.

    Fixes: c30da2e981a7 ("fuse: convert to use the new mount API")
    Cc: # v5.4
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • fuse_writepages_fill() calls tree_insert() with ap->num_pages = 0 which
    triggers the following warning:

    WARNING: CPU: 1 PID: 17211 at fs/fuse/file.c:1728 tree_insert+0xab/0xc0 [fuse]
    RIP: 0010:tree_insert+0xab/0xc0 [fuse]
    Call Trace:
    fuse_writepages_fill+0x5da/0x6a0 [fuse]
    write_cache_pages+0x171/0x470
    fuse_writepages+0x8a/0x100 [fuse]
    do_writepages+0x43/0xe0

    Fix up the warning and clean up the code around rb-tree insertion:

    - Rename tree_insert() to fuse_insert_writeback() and make it return the
    conflicting entry in case of failure

    - Re-add tree_insert() as a wrapper around fuse_insert_writeback()

    - Rename fuse_writepage_in_flight() to fuse_writepage_add() and reverse
    the meaning of the return value to mean

    + "true" in case the writepage entry was successfully added

    + "false" in case it was in-flight: queued on an existing writepage
    entry's auxiliary list, or the existing writepage entry's temporary
    page was updated

    Switch from fuse_find_writeback() + tree_insert() to
    fuse_insert_writeback()

    - Move setting orig_pages to before inserting/updating the entry; this may
    result in the orig_pages value being discarded later in case of an
    in-flight request

    - In case of a new writepage entry use fuse_writepage_add()
    unconditionally, only set data->wpa if the entry was added.
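
    The "insert, or return the conflicting entry" pattern can be sketched
    in userspace. This is an illustration under stated assumptions: an
    unbalanced binary search tree stands in for the kernel rb-tree, and
    struct wb_range and range_insert() are hypothetical names.

```c
#include <stddef.h>

struct wb_range {
	unsigned long start, end;	/* inclusive page index interval */
	struct wb_range *left, *right;
};

/*
 * Insert 'new' unless it overlaps an existing entry.
 * Returns NULL on success, or the conflicting entry on failure, so the
 * caller does not need a separate lookup before inserting.
 */
static struct wb_range *range_insert(struct wb_range **root,
				     struct wb_range *new)
{
	struct wb_range **p = root;

	while (*p) {
		struct wb_range *cur = *p;

		if (new->end < cur->start)
			p = &cur->left;
		else if (new->start > cur->end)
			p = &cur->right;
		else
			return cur;	/* overlap: report the conflict */
	}
	new->left = new->right = NULL;
	*p = new;
	return NULL;
}
```

    A single tree walk both detects the conflict and finds the insertion
    point, which is what replacing a find followed by an insert buys.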

    Fixes: 6b2fb79963fb ("fuse: optimize writepages search")
    Reported-by: kernel test robot
    Original-patch-by: Vasily Averin
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • In fuse_writepage_end() the old writepages entry needs to be removed from
    the rbtree before inserting the new one, otherwise tree_insert() would
    fail. This is a very rare codepath and no reproducer exists.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

10 Jun, 2020

1 commit

  • Pull fuse updates from Miklos Szeredi:

    - Fix a rare deadlock in virtiofs

    - Fix st_blocks in writeback cache mode

    - Fix wrong checks in splice move causing spurious warnings

    - Fix a race between a GETATTR request and a FUSE_NOTIFY_INVAL_INODE
    notification

    - Use rb-tree instead of linear search for pages currently under
    writeout by userspace

    - Fix copy_file_range() inconsistencies

    * tag 'fuse-update-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: copy_file_range should truncate cache
    fuse: fix copy_file_range cache issues
    fuse: optimize writepages search
    fuse: update attr_version counter on fuse_notify_inval_inode()
    fuse: don't check refcount after stealing page
    fuse: fix weird page warning
    fuse: use dump_page
    virtiofs: do not use fuse_fill_super_common() for device installation
    fuse: always allow query of st_dev
    fuse: always flush dirty data on close(2)
    fuse: invalidate inode attr in writeback cache mode
    fuse: Update stale comment in queue_interrupt()
    fuse: BUG_ON correction in fuse_dev_splice_write()
    virtiofs: Add mount option and atime behavior to the doc
    virtiofs: schedule blocking async replies in separate worker

    Linus Torvalds
     

04 Jun, 2020

3 commits

  • Merge more updates from Andrew Morton:
    "More mm/ work, plenty more to come

    Subsystems affected by this patch series: slub, memcg, gup, kasan,
    pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs,
    thp, mmap, kconfig"

    * akpm: (131 commits)
    arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
    x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
    riscv: support DEBUG_WX
    mm: add DEBUG_WX support
    drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup
    mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid()
    powerpc/mm: drop platform defined pmd_mknotpresent()
    mm: thp: don't need to drain lru cache when splitting and mlocking THP
    hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs
    sparc32: register memory occupied by kernel as memblock.memory
    include/linux/memblock.h: fix minor typo and unclear comment
    mm, mempolicy: fix up gup usage in lookup_node
    tools/vm/page_owner_sort.c: filter out unneeded line
    mm: swap: memcg: fix memcg stats for huge pages
    mm: swap: fix vmstats for huge pages
    mm: vmscan: limit the range of LRU type balancing
    mm: vmscan: reclaim writepage is IO cost
    mm: vmscan: determine anon/file pressure balance at the reclaim root
    mm: balance LRU lists based on relative thrashing
    mm: only count actual rotations as LRU reclaim cost
    ...

    Linus Torvalds
     
  • They're the same function, and for the purpose of all callers they are
    equivalent to lru_cache_add().

    [akpm@linux-foundation.org: fix it for local_lock changes]
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Michal Hocko
    Acked-by: Minchan Kim
    Cc: Joonsoo Kim
    Link: http://lkml.kernel.org/r/20200520232525.798933-5-hannes@cmpxchg.org
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Pull splice updates from Al Viro:
    "Christoph's assorted splice cleanups"

    * 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: rename pipe_buf ->steal to ->try_steal
    fs: make the pipe_buf_operations ->confirm operation optional
    fs: make the pipe_buf_operations ->steal operation optional
    trace: remove tracing_pipe_buf_ops
    pipe: merge anon_pipe_buf*_ops
    fs: simplify do_splice_from
    fs: simplify do_splice_to

    Linus Torvalds
     

03 Jun, 2020

2 commits

  • Merge updates from Andrew Morton:
    "A few little subsystems and a start of a lot of MM patches.

    Subsystems affected by this patch series: squashfs, ocfs2, parisc,
    vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
    swap, memcg, pagemap, memory-failure, vmalloc, kasan"

    * emailed patches from Andrew Morton: (128 commits)
    kasan: move kasan_report() into report.c
    mm/mm_init.c: report kasan-tag information stored in page->flags
    ubsan: entirely disable alignment checks under UBSAN_TRAP
    kasan: fix clang compilation warning due to stack protector
    x86/mm: remove vmalloc faulting
    mm: remove vmalloc_sync_(un)mappings()
    x86/mm/32: implement arch_sync_kernel_mappings()
    x86/mm/64: implement arch_sync_kernel_mappings()
    mm/ioremap: track which page-table levels were modified
    mm/vmalloc: track which page-table levels were modified
    mm: add functions to track page directory modifications
    s390: use __vmalloc_node in stack_alloc
    powerpc: use __vmalloc_node in alloc_vm_stack
    arm64: use __vmalloc_node in arch_alloc_vmap_stack
    mm: remove vmalloc_user_node_flags
    mm: switch the test_vmalloc module to use __vmalloc_node
    mm: remove __vmalloc_node_flags_caller
    mm: remove both instances of __vmalloc_node_flags
    mm: remove the prot argument to __vmalloc_node
    mm: remove the pgprot argument to __vmalloc
    ...

    Linus Torvalds
     
  • Implement the new readahead operation in fuse by using __readahead_batch()
    to fill the array of pages in fuse_args_pages directly. This lets us
    inline fuse_readpages_fill() into fuse_readahead().

    [willy@infradead.org: build fix]
    Link: http://lkml.kernel.org/r/20200415025938.GB5820@bombadil.infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Reviewed-by: Dave Chinner
    Reviewed-by: William Kucharski
    Acked-by: Miklos Szeredi
    Cc: Chao Yu
    Cc: Christoph Hellwig
    Cc: Cong Wang
    Cc: Darrick J. Wong
    Cc: Eric Biggers
    Cc: Gao Xiang
    Cc: Jaegeuk Kim
    Cc: John Hubbard
    Cc: Joseph Qi
    Cc: Junxiao Bi
    Cc: Michal Hocko
    Cc: Zi Yan
    Cc: Johannes Thumshirn
    Link: http://lkml.kernel.org/r/20200414150233.24495-25-willy@infradead.org
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     

21 May, 2020

1 commit


20 May, 2020

2 commits

  • After the copy operation completes the cache is not up-to-date. Truncate
    all pages in the interval that has successfully been copied.

    Truncating completely copied dirty pages is okay, since the data has been
    overwritten anyway. Truncating partially copied dirty pages is not okay;
    add a comment for now.
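
    The distinction between completely and partially copied pages comes
    down to page-boundary arithmetic. The sketch below assumes a 4096-byte
    page and uses a hypothetical helper; it is not the code from this
    patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size */

/*
 * Compute the inclusive range of page indices that lie entirely inside the
 * copied byte interval [pos, pos + len). Returns false if no page is fully
 * covered. Pages touched only partially at either edge are excluded.
 */
static bool fully_copied_pages(uint64_t pos, uint64_t len,
			       uint64_t *first, uint64_t *last)
{
	uint64_t start = (pos + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	uint64_t end = (pos + len) / SKETCH_PAGE_SIZE;	/* exclusive */

	if (end <= start)
		return false;
	*first = start;
	*last = end - 1;
	return true;
}
```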

    Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()")
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • a) Dirty cache needs to be written back not just in the writeback_cache
    case, since the dirty pages may come from memory maps.

    b) The fuse_writeback_range() helper takes an inclusive interval, so the
    end position needs to be pos+len-1 instead of pos+len.
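
    The off-by-one in (b) is easy to see in isolation. A minimal sketch
    with a hypothetical helper, assuming len is greater than zero:

```c
#include <stdint.h>

/*
 * Last byte offset covered by a range of 'len' bytes starting at 'pos'.
 * An inclusive interval ends at pos + len - 1; pos + len would be the first
 * byte after the range.
 */
static uint64_t inclusive_end(uint64_t pos, uint64_t len)
{
	return pos + len - 1;
}
```

    For pos = 0 and len = 4096 the inclusive end is 4095, so the range
    stays within the first 4096-byte page instead of touching the next
    one.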

    Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()")
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

19 May, 2020

9 commits

  • Re-work fi->writepages, replacing list with rb-tree. This improves
    performance because kernel fuse iterates through fi->writepages for each
    writeback page and typical number of entries is about 800 (for 100MB of
    fuse writeback).

    Before patch:

    10240+0 records in
    10240+0 records out
    10737418240 bytes (11 GB) copied, 41.3473 s, 260 MB/s

    2 1 0 57445400 40416 6323676 0 0 33 374743 8633 19210 1 8 88 3 0

    29.86% [kernel] [k] _raw_spin_lock
    26.62% [fuse] [k] fuse_page_is_writeback

    After patch:

    10240+0 records in
    10240+0 records out
    10737418240 bytes (11 GB) copied, 21.4954 s, 500 MB/s

    2 9 0 53676040 31744 10265984 0 0 64 854790 10956 48387 1 6 88 6 0

    23.55% [kernel] [k] copy_user_enhanced_fast_string
    9.87% [kernel] [k] __memcpy
    3.10% [kernel] [k] _raw_spin_lock

    Signed-off-by: Maxim Patlasov
    Signed-off-by: Vasily Averin
    Signed-off-by: Miklos Szeredi

    Maxim Patlasov
     
  • A GETATTR request can race with FUSE_NOTIFY_INVAL_INODE, resulting in the
    attribute cache being updated with stale information after the
    invalidation.

    Fix this by bumping the attribute version in fuse_reverse_inval_inode().
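
    The idea of rejecting a stale reply with a version counter can be
    sketched as follows. This is a simplified single-threaded model with
    hypothetical names; the kernel's actual counter handling and locking
    differ.

```c
#include <stdbool.h>
#include <stdint.h>

struct attr_cache {
	uint64_t version;	/* bumped on every invalidation */
	long size;		/* example cached attribute */
};

/* Remember the version that was current when the request was sent. */
static uint64_t getattr_begin(const struct attr_cache *c)
{
	return c->version;
}

/* An invalidation notification bumps the version. */
static void invalidate(struct attr_cache *c)
{
	c->version++;
}

/* Apply the reply only if no invalidation happened in between. */
static bool getattr_finish(struct attr_cache *c, uint64_t seen, long size)
{
	if (seen != c->version)
		return false;	/* reply predates an invalidation: drop it */
	c->size = size;
	return true;
}
```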

    Reported-by: Krzysztof Rusek
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • page_count() is unstable. Unless there has been an RCU grace period
    between when the page was removed from the page cache and now, a
    speculative reference may exist from the page cache.

    Reported-by: Matthew Wilcox
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • When PageWaiters was added, updating this check was missed.

    Reported-by: Nikolaus Rath
    Reported-by: Hugh Dickins
    Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Instead of custom page dumping, use the standard helper.

    Reported-by: Matthew Wilcox
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • fuse_fill_super_common() allocates and installs one fuse_device. Hence
    virtiofs allocates and installs all fuse devices by itself except one.

    This makes the logic a little twisted. There does not seem to be any real
    reason why virtiofs can't allocate and install all fuse devices itself.

    So opt out of fuse device allocation and installation while calling
    fuse_fill_super_common().

    Regular fuse still wants fuse_fill_super_common() to install fuse_device.
    It needs to protect against races where two mounters are trying to mount
    fuse using the same fd. In that case one will succeed while the other will
    get -EINVAL.

    virtiofs does not have this issue because sget_fc() resolves the race
    w.r.t. multiple mounters and only one instance of virtio_fs_fill_super()
    should be in progress for the same filesystem.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Miklos Szeredi

    Vivek Goyal
     
  • Fuse mounts without "allow_other" are off-limits to all non-owners. Yet it
    makes sense to allow querying st_dev on the root, since this value is
    provided by the kernel, not the userspace filesystem.

    Allow statx(2) with a zero request mask to succeed on fuse mounts for all
    users.

    Reported-by: Nikolaus Rath
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • We want cached data to be synced with the userspace filesystem on close(),
    for example to allow getting the correct st_blocks value. Do this regardless of
    whether the userspace filesystem implements a FLUSH method or not.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Under writeback mode, inode->i_blocks is not updated, making utilities
    such as du read st_blocks as 0.

    For example, when using virtiofs (cache=always & nondax mode) with
    writeback_cache enabled, writing a new file and checking its disk usage
    with du reports 0 usage.

    # uname -r
    5.6.0-rc6+
    # mount -t virtiofs virtiofs /mnt/virtiofs
    # rm -f /mnt/virtiofs/testfile

    # create new file and do extend write
    # xfs_io -fc "pwrite 0 4k" /mnt/virtiofs/testfile
    wrote 4096/4096 bytes at offset 0
    4 KiB, 1 ops; 0.0001 sec (28.103 MiB/sec and 7194.2446 ops/sec)
    # du -k /mnt/virtiofs/testfile
    0

    Signed-off-by: Miklos Szeredi

    Eryu Guan
     

21 Apr, 2020

1 commit

  • Several references got broken due to txt to ReST conversion.

    Several of them can be automatically fixed with:

    scripts/documentation-file-ref-check --fix

    Reviewed-by: Mathieu Poirier # hwtracing/coresight/Kconfig
    Reviewed-by: Paul E. McKenney # memory-barrier.txt
    Acked-by: Alex Shi # translations/zh_CN
    Acked-by: Federico Vaga # translations/it_IT
    Acked-by: Marc Zyngier # kvm/arm64
    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/6f919ddb83a33b5f2a63b6b5f0575737bb2b36aa.1586881715.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

20 Apr, 2020

3 commits

  • Fixes: 04ec5af0776e "fuse: export fuse_end_request()"
    Signed-off-by: Kirill Tkhai
    Signed-off-by: Miklos Szeredi

    Kirill Tkhai
     
  • Commit 963545357202 ("fuse: reduce allocation size for splice_write")
    changed the size of the bufs array, so the BUG_ON which checks the index
    of the array should also be fixed.

    [SzM: turn BUG_ON into WARN_ON]

    Fixes: 963545357202 ("fuse: reduce allocation size for splice_write")
    Signed-off-by: Vasily Averin
    Signed-off-by: Miklos Szeredi

    Vasily Averin
     
  • In virtiofs (unlike in regular fuse) processing of async replies is
    serialized. This can result in a deadlock in rare corner cases when
    there's a circular dependency between the completion of two or more async
    replies.

    Such a deadlock can be reproduced with xfstests:generic/503 if TEST_DIR ==
    SCRATCH_MNT (which is a misconfiguration):

    - Process A is waiting for page lock in worker thread context and blocked
    (virtio_fs_requests_done_work()).
    - Process B is holding page lock and waiting for pending writes to
    finish (fuse_wait_on_page_writeback()).
    - Write requests are waiting in virtqueue and can't complete because
    worker thread is blocked on page lock (process A).

    Fix this by creating a unique work_struct for each async reply that can
    block (O_DIRECT read).

    Fixes: a62a8ef9d97d ("virtio-fs: add virtiofs filesystem")
    Signed-off-by: Vivek Goyal
    Signed-off-by: Miklos Szeredi

    Vivek Goyal
     

13 Feb, 2020

1 commit

  • Normal, synchronous requests will have their args allocated on the stack.
    After the FR_FINISHED bit is set by receiving the reply from the userspace
    fuse server, the originating task may return and reuse the stack frame,
    resulting in an Oops if the args structure is dereferenced.

    Fix by setting a flag in the request itself upon initializing, indicating
    whether it has an asynchronous ->end() callback.
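
    The fix pattern can be sketched in userspace: record at
    initialization whether a completion callback exists, and consult only
    that recorded flag once the request is finished. All struct and
    function names below are hypothetical simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

struct req_args {
	void (*end)(struct req_args *args);	/* NULL for synchronous requests */
	int completions;
};

struct request {
	struct req_args *args;
	bool async;	/* recorded once, at initialization */
	bool finished;
};

static void request_init(struct request *req, struct req_args *args)
{
	req->args = args;
	req->async = args->end != NULL;
	req->finished = false;
}

static void request_end(struct request *req)
{
	bool async = req->async;	/* read from the request, not from args */

	req->finished = true;
	/* For a synchronous request, args may no longer be valid from here on. */
	if (async)
		req->args->end(req->args);
}

/* Example completion callback for asynchronous requests. */
static void count_completion(struct req_args *args)
{
	args->completions++;
}
```

    The synchronous path never touches args after the finished mark, so it
    no longer matters whether the originating stack frame has been reused.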

    Reported-by: Kyle Sanderson
    Reported-by: Michael Stapelberg
    Fixes: 2b319d1f6f92 ("fuse: don't dereference req->args on finished request")
    Cc: # v5.4
    Tested-by: Michael Stapelberg
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

09 Feb, 2020

1 commit

  • Pull vfs file system parameter updates from Al Viro:
    "Saner fs_parser.c guts and data structures. The system-wide registry
    of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
    the horror switch() in fs_parse() that would have to grow another case
    every time something got added to that system-wide registry.

    New syntax types can be added by filesystems easily now, and their
    namespace is that of functions - not of system-wide enum members. IOW,
    they can be shared or kept private and if some turn out to be widely
    useful, we can make them common library helpers, etc., without having
    to do anything whatsoever to fs_parse() itself.

    And we already get that kind of requests - the thing that finally
    pushed me into doing that was "oh, and let's add one for timeouts -
    things like 15s or 2h". If some filesystem really wants that, let them
    do it. Without somebody having to play gatekeeper for the variants
    blessed by direct support in fs_parse(), TYVM.

    Quite a bit of boilerplate is gone. And IMO the data structures make a
    lot more sense now. -200LoC, while we are at it"
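
    The "syntax types as functions" idea can be sketched in userspace as
    follows. This is not the fs_parse() API; every name below is
    hypothetical. Each parameter names a parsing function, so a private
    type such as a timeout with an s/m/h suffix needs no change to the
    generic lookup code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* A syntax type is just a function that parses a value. */
typedef int param_type_fn(const char *value, long *result);

struct param_spec {
	const char *name;
	param_type_fn *type;
};

/* Generic type: plain decimal integer (overflow checking omitted). */
static int param_is_int(const char *value, long *result)
{
	char *end;
	long v = strtol(value, &end, 10);

	if (end == value || *end != '\0')
		return -EINVAL;
	*result = v;
	return 0;
}

/* Private type: timeout such as "15s" or "2h", returned in seconds. */
static int param_is_timeout(const char *value, long *result)
{
	char *end;
	long v = strtol(value, &end, 10);

	if (end == value || v < 0)
		return -EINVAL;
	if (strcmp(end, "s") == 0)
		*result = v;
	else if (strcmp(end, "m") == 0)
		*result = v * 60;
	else if (strcmp(end, "h") == 0)
		*result = v * 3600;
	else
		return -EINVAL;
	return 0;
}

static const struct param_spec specs[] = {
	{ "max_read", param_is_int },
	{ "timeout", param_is_timeout },
};

/* Generic lookup: knows nothing about the individual syntax types. */
static int parse_param(const char *name, const char *value, long *result)
{
	size_t i;

	for (i = 0; i < sizeof(specs) / sizeof(specs[0]); i++)
		if (strcmp(specs[i].name, name) == 0)
			return specs[i].type(value, result);
	return -EINVAL;
}
```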

    * 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
    tmpfs: switch to use of invalfc()
    cgroup1: switch to use of errorfc() et.al.
    procfs: switch to use of invalfc()
    hugetlbfs: switch to use of invalfc()
    cramfs: switch to use of errofc() et.al.
    gfs2: switch to use of errorfc() et.al.
    fuse: switch to use errorfc() et.al.
    ceph: use errorfc() and friends instead of spelling the prefix out
    prefix-handling analogues of errorf() and friends
    turn fs_param_is_... into functions
    fs_parse: handle optional arguments sanely
    fs_parse: fold fs_parameter_desc/fs_parameter_spec
    fs_parser: remove fs_parameter_description name field
    add prefix to fs_context->log
    ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
    new primitive: __fs_parse()
    switch rbd and libceph to p_log-based primitives
    struct p_log, variants of warnf() et.al. taking that one instead
    teach logfc() to handle prefices, give it saner calling conventions
    get rid of cg_invalf()
    ...

    Linus Torvalds
     

08 Feb, 2020

3 commits


06 Feb, 2020

1 commit

  • Fixes coccicheck warning:

    fs/fuse/readdir.c:335:1-19: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/file.c:1398:2-19: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/file.c:1400:2-20: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/cuse.c:454:1-20: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/cuse.c:455:1-19: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:497:2-17: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:504:2-23: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:511:2-22: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:518:2-23: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:522:2-26: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:526:2-18: WARNING: Assignment of 0/1 to bool variable
    fs/fuse/inode.c:1000:1-20: WARNING: Assignment of 0/1 to bool variable
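
    The kind of change this warning asks for looks like the sketch below.
    The struct and field names are hypothetical and are not the fields
    touched in the files listed above.

```c
#include <stdbool.h>

struct mount_opts {
	bool allow_other;
	bool default_permissions;
};

/* Assign bool fields with true/false rather than the integers 1/0. */
static void set_defaults(struct mount_opts *o)
{
	o->allow_other = false;		/* rather than: = 0 */
	o->default_permissions = true;	/* rather than: = 1 */
}
```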

    Reported-by: Hulk Robot
    Signed-off-by: zhengbin
    Signed-off-by: Miklos Szeredi

    zhengbin