17 Aug, 2022
3 commits
-
[ Upstream commit c64797809a64c73497082aa05e401a062ec1af34 ]
The commit 15c8e72e88e0 ("fuse: allow skipping control interface and forced
unmount") tries to remove the control interface for virtio-fs since it does
not support aborting requests which are being processed. But it doesn't
work now.
This patch fixes it by skipping creating the control interface if
fuse_conn->no_control is set.
Fixes: 15c8e72e88e0 ("fuse: allow skipping control interface and forced unmount")
Signed-off-by: Xie Yongji
Signed-off-by: Miklos Szeredi
Signed-off-by: Sasha Levin
-
commit 02c0cab8e7345b06f1c0838df444e2902e4138d3 upstream.
Overlayfs may fail to complete updates when a filesystem lacks
fileattr/xattr syscall support and responds with an ENOSYS error code,
resulting in an unexpected "Function not implemented" error.
This bug may occur with FUSE filesystems, such as davfs2.
Steps to reproduce:
# install davfs2, e.g., apk add davfs2
mkdir /test
mkdir /test/lower /test/upper /test/work /test/mnt
yes '' | mount -t davfs -o ro http://some-web-dav-server/path \
    /test/lower
mount -t overlay -o upperdir=/test/upper,lowerdir=/test/lower \
    -o workdir=/test/work overlay /test/mnt
# when "some-file" exists in the lowerdir, this fails with "Function
# not implemented", with dmesg showing "overlayfs: failed to retrieve
# lower fileattr (/some-file, err=-38)"
touch /test/mnt/some-file
The underlying cause of this regression is actually in FUSE, which fails to
translate the ENOSYS error code returned by the userspace filesystem (which
means that the ioctl operation is not supported) to ENOTTY.
Reported-by: Christian Kohlschütter
Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags")
Fixes: 59efec7b9039 ("fuse: implement ioctl support")
Cc:
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
-
commit 47912eaa061a6a81e4aa790591a1874c650733c0 upstream.
Limit nanoseconds to 0..999999999.
Fixes: d8a5ba45457e ("[PATCH] FUSE - core")
Cc:
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
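The commit message above only states the target range. As a minimal user-space sketch of what limiting a nanosecond field to 0..999999999 can look like (the names `MODEL_NSEC_PER_SEC` and `clamp_nsec` are hypothetical and are not taken from the kernel patch):

```c
#include <stdint.h>

/* Hypothetical constant for this sketch: nanoseconds in one second. */
#define MODEL_NSEC_PER_SEC 1000000000u

/*
 * Clamp a nanosecond value supplied by an untrusted source (here: a
 * userspace filesystem server) into the valid range 0..999999999.
 */
uint32_t clamp_nsec(uint32_t nsec)
{
    return nsec < MODEL_NSEC_PER_SEC ? nsec : MODEL_NSEC_PER_SEC - 1;
}
```

Values already in range pass through unchanged; anything larger is pinned to the maximum rather than rejected, which is one possible policy and an assumption of this sketch.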
01 May, 2022
1 commit
-
commit a6294593e8a1290091d0b078d5d33da5e0cd3dfe upstream
Turn iov_iter_fault_in_readable into a function that returns the number
of bytes not faulted in, similar to copy_to_user, instead of returning a
non-zero value when any of the requested pages couldn't be faulted in.
This supports the existing users that require all pages to be faulted in
as well as new users that are happy if any pages can be faulted in.
Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
sure this change doesn't silently break things.
Signed-off-by: Andreas Gruenbacher
Signed-off-by: Anand Jain
Signed-off-by: Greg Kroah-Hartman
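The return convention described above ("number of bytes not faulted in") can be illustrated with a small user-space model. This is only a sketch: `model_fault_in_readable`, `need_all`, `usable_bytes` and the `resident` parameter are hypothetical stand-ins, not the kernel API.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a fault-in helper: pretend that only the
 * first `resident` bytes of a `len`-byte buffer can be faulted in, and
 * return the number of bytes that could NOT be faulted in.
 * A return value of 0 means the whole range is available.
 */
size_t model_fault_in_readable(size_t len, size_t resident)
{
    size_t ok = len < resident ? len : resident;
    return len - ok;
}

/* Existing-style caller: requires every byte, so any remainder is an error. */
int need_all(size_t len, size_t resident)
{
    return model_fault_in_readable(len, resident) == 0 ? 0 : -EFAULT;
}

/* New-style caller: can make progress with whatever was faulted in. */
size_t usable_bytes(size_t len, size_t resident)
{
    return len - model_fault_in_readable(len, resident);
}
```

Both caller styles are served by the same return value, which is the point the commit message makes about supporting old and new users.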
16 Mar, 2022
2 commits
-
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream.
In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.
On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.
This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.
Fix by copying pages coming from the user address space to new pipe
buffers.
Reported-by: Jann Horn
Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device")
Cc:
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
-
commit a679a61520d8a7b0211a1da990404daf5cc80b72 upstream.
The fileattr API conversion broke lsattr on ntfs3g.
Previously the ioctl(... FS_IOC_GETFLAGS) returned an EINVAL error, but
after the conversion the error returned by the fuse filesystem was not
propagated back to the ioctl() system call, resulting in success being
returned with bogus values.
Fix by checking for outarg.result in fuse_priv_ioctl(), just as generic
ioctl code does.
Reported-by: Jean-Pierre André
Fixes: 72227eac177d ("fuse: convert to fileattr")
Cc: # v5.13
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
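The failure mode above is that a request can be delivered successfully while the server's reply still carries an error. A minimal user-space sketch of propagating that reply error follows; `model_ioctl_out` and `model_priv_ioctl` are hypothetical names, and the structure is an assumption based only on the `outarg.result` field mentioned in the message.

```c
#include <errno.h>

/* Hypothetical model of the server's ioctl reply. */
struct model_ioctl_out {
    int result; /* 0 or a negative errno chosen by the server */
};

/*
 * transport_err: error from sending the request (0 on success).
 * out:           the server's reply, valid only if transport_err == 0.
 *
 * Returns the first error encountered, so a server-side failure is not
 * silently reported as success.
 */
int model_priv_ioctl(int transport_err, const struct model_ioctl_out *out)
{
    if (transport_err)
        return transport_err;
    if (out->result < 0)
        return out->result;
    return 0;
}
```

Without the second check, a reply with `result == -EINVAL` would be treated as success and the caller would read whatever values happen to be in the reply.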
27 Jan, 2022
1 commit
-
commit e388164ea385f04666c4633f5dc4f951fca71890 upstream.
The acceptable maximum value of lend parameter in
filemap_write_and_wait_range() is LLONG_MAX rather than -1. And there is
also some logic depending on LLONG_MAX check in write_cache_pages(). So
let's pass LLONG_MAX to filemap_write_and_wait_range() in
fuse_writeback_range() instead.
Fixes: 59bda8ecee2f ("fuse: flush extending writes")
Signed-off-by: Xie Yongji
Cc: # v5.15
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
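The message notes that some logic depends on an LLONG_MAX check. A small user-space sketch of why -1 is not interchangeable with LLONG_MAX as an "up to end of file" marker, assuming a hypothetical whole-range test of the following shape (`model_range_whole` is not a kernel function):

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical model of a "does this range cover the whole file?" test
 * that recognises only LLONG_MAX as the open-ended upper bound.
 */
bool model_range_whole(long long start, long long end)
{
    return start == 0 && end == LLONG_MAX;
}
```

With a signed offset type, -1 is simply a negative number: it neither equals LLONG_MAX nor compares greater than any valid offset, so a caller passing -1 would not be recognised by a test like this one.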
22 Dec, 2021
1 commit
-
commit bda9a71980e083699a0360963c0135657b73f47a upstream.
Add missing inode lock annotation; found by syzbot.
Reported-and-tested-by: syzbot+9f747458f5990eaa8d43@syzkaller.appspotmail.com
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
17 Dec, 2021
1 commit
-
commit 5c791fe1e2a4f401f819065ea4fc0450849f1818 upstream.
In writeback cache mode mtime/ctime updates are cached, and flushed to the
server using the ->write_inode() callback.
Closing the file will result in a dirty inode being immediately written,
but in other cases the inode can remain dirty after all references are
dropped. This results in the inode being written back from reclaim, which
can deadlock on a regular allocation while the request is being served.
The usual mechanisms (GFP_NOFS/PF_MEMALLOC*) don't work for FUSE, because
serving a request involves unrelated userspace process(es).
Instead do the same as for dirty pages: make sure the inode is written
before the last reference is gone.
- fallocate(2)/copy_file_range(2): these call file_update_time() or
file_modified(), so flush the inode before returning from the call
- unlink(2), link(2) and rename(2): these call fuse_update_ctime(), so
flush the ctime directly from this helper
Reported-by: chenguanyou
Signed-off-by: Miklos Szeredi
Cc: Ed Tsai
Signed-off-by: Greg Kroah-Hartman
01 Dec, 2021
1 commit
-
commit 473441720c8616dfaf4451f9c7ea14f0eb5e5d65 upstream.
Checking buf->flags should be done before the pipe_buf_release() is called
on the pipe buffer, since releasing the buffer might modify the flags.
This is exactly what page_cache_pipe_buf_release() does, which results
in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was
trying to fix.
Reported-by: Justin Forbes
Fixes: 712a951025c0 ("fuse: fix page stealing")
Cc: # v2.6.35
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
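The ordering rule in this fix (read the flags first, release afterwards) can be shown with a small user-space model. All names here (`model_pipe_buf`, `MODEL_BUF_FLAG_LRU`, `model_buf_release`, `model_consume`) are hypothetical; the only property taken from the message is that the release hook may modify the flags.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this sketch. */
#define MODEL_BUF_FLAG_LRU 0x01u

struct model_pipe_buf {
    unsigned int flags;
    bool released;
};

/* A release hook that, as described above, may modify the buffer's flags. */
void model_buf_release(struct model_pipe_buf *buf)
{
    buf->flags = 0;
    buf->released = true;
}

/*
 * Sample the flag BEFORE releasing the buffer, then release it.
 * Returns whether the flag was set at the time the buffer was consumed.
 */
bool model_consume(struct model_pipe_buf *buf)
{
    bool was_lru = (buf->flags & MODEL_BUF_FLAG_LRU) != 0;

    model_buf_release(buf);
    return was_lru;
}
```

Swapping the two statements in `model_consume()` would make it always return false in this model, because the release hook has already cleared the flags by the time they are read.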
19 Nov, 2021
1 commit
-
commit 712a951025c0667ff00b25afc360f74e639dfabe upstream.
It is possible to trigger a crash by splicing anon pipe bufs to the fuse
device.
The reason for this is that anon_pipe_buf_release() will reuse buf->page if
the refcount is 1, but that page might have already been stolen and its
flags modified (e.g. PG_lru added).
This happens in the unlikely case of fuse_dev_splice_write() getting around
to calling pipe_buf_release() after a page has been stolen, added to the
page cache and removed from the page cache.
Fix by calling pipe_buf_release() right after the page was inserted into
the page cache. In this case the page has an elevated refcount so any
release function will know that the page isn't reusable.
Reported-by: Frank Dinoff
Link: https://lore.kernel.org/r/CAAmZXrsGg2xsP1CK+cbuEMumtrqdvD-NKnWzhNcvn71RV3c1yw@mail.gmail.com/
Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device")
Cc: # v2.6.35
Signed-off-by: Miklos Szeredi
Signed-off-by: Greg Kroah-Hartman
21 Oct, 2021
5 commits
-
Instead of "goto err", return error directly, since there's no error
cleanup to do now.
Signed-off-by: Miklos Szeredi
-
Syzkaller reports a null pointer dereference in fuse_test_super() that is
caused by sb->s_fs_info being NULL.
This is due to the fact that fuse_fill_super() is initializing s_fs_info,
which is too late, it's already on the fs_supers list. The initialization
needs to be done in sget_fc() with the sb_lock held.
Move allocation of fuse_mount and fuse_conn from fuse_fill_super() into
fuse_get_tree().
After this ->kill_sb() will always be called with non-NULL ->s_fs_info,
hence fuse_mount_destroy() can drop the test for non-NULL "fm".
Reported-by: syzbot+74a15f02ccb51f398601@syzkaller.appspotmail.com
Fixes: 5d5b74aa9c76 ("fuse: allow sharing existing sb")
Signed-off-by: Miklos Szeredi
-
1. call fuse_mount_destroy() for open coded variants
2. before deactivate_locked_super() don't need fuse_mount destruction since
that will now be done (if ->s_fs_info is not cleared)
3. rearrange fuse_mount setup in fuse_get_tree_submount() so that the
regular pattern can be used
Signed-off-by: Miklos Szeredi
-
The ->put_super callback is called from generic_shutdown_super() in case of
a fully initialized sb. This is called from kill_***_super(), which is
called from ->kill_sb instances.
Fuse uses ->put_super to destroy the fs specific fuse_mount and drop the
reference to the fuse_conn, while it does the same on each error case
during sb setup.
This patch moves the destruction from fuse_put_super() to
fuse_mount_destroy(), called at the end of all ->kill_sb instances. A
followup patch will clean up the error paths.
Signed-off-by: Miklos Szeredi
-
Checking "fm" works because currently sb->s_fs_info is cleared on error
paths; however, sb->s_root is what generic_shutdown_super() checks to
determine whether the sb was fully initialized or not.
This change will allow cleanup of sb setup error paths.
Signed-off-by: Miklos Szeredi
08 Sep, 2021
1 commit
-
Pull fuse updates from Miklos Szeredi:
- Allow mounting an active fuse device. Previously the fuse device
would always be mounted during initialization, and sharing a fuse
superblock was only possible through mount or namespace cloning
- Fix data flushing in syncfs (virtiofs only)
- Fix data flushing in copy_file_range()
- Fix a possible deadlock in atomic O_TRUNC
- Misc fixes and cleanups
* tag 'fuse-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: remove unused arg in fuse_write_file_get()
fuse: wait for writepages in syncfs
fuse: flush extending writes
fuse: truncate pagecache on atomic_o_trunc
fuse: allow sharing existing sb
fuse: move fget() to fuse_get_tree()
fuse: move option checking into fuse_fill_super()
fuse: name fs_context consistently
fuse: fix use after free in fuse_read_interrupt()
06 Sep, 2021
2 commits
-
The struct fuse_conn argument is not used and can be removed.
Signed-off-by: Miklos Szeredi
-
In case of fuse the MM subsystem doesn't guarantee that page writeback
completes by the time ->sync_fs() is called. This is because fuse
completes page writeback immediately to prevent DoS of memory reclaim by
the userspace file server.
This means that fuse itself must ensure that writes are synced before
sending the SYNCFS request to the server.
Introduce sync buckets that hold a counter for the number of outstanding
write requests. On syncfs replace the current bucket with a new one and
wait until the old bucket's counter goes down to zero.
It is possible to have multiple syncfs calls in parallel, in which case
there could be more than one waited-on bucket. Descendant buckets must
not complete until the parent completes. Add a count to the child (new)
bucket until the (parent) old bucket completes.
Use RCU protection to dereference the current bucket and to wake up an
emptied bucket. Use fc->lock to protect against parallel assignments to
the current bucket.
This leaves just the counter to be a possible scalability issue. The
fc->num_waiting counter has a similar issue, so both should be addressed at
the same time.
Reported-by: Amir Goldstein
Fixes: 2d82ab251ef0 ("virtiofs: propagate sync() to file server")
Cc: # v5.14
Signed-off-by: Miklos Szeredi
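The bucket scheme described above can be modelled in a few lines of single-threaded user-space C. This is a deliberately simplified sketch: it has no RCU, locking or wait queues, and every name (`model_bucket`, `model_conn`, `model_write_start`, and so on) is hypothetical rather than taken from the patch.

```c
#include <stdbool.h>

struct model_bucket {
    int count; /* outstanding writes, plus at most one "parent pending" pin */
};

struct model_conn {
    struct model_bucket *curr; /* bucket that new writes are charged to */
};

/* A write is charged to whichever bucket is current when it starts. */
struct model_bucket *model_write_start(struct model_conn *fc)
{
    fc->curr->count++;
    return fc->curr;
}

/* Write completion drops the count on the bucket the write was charged to. */
void model_write_end(struct model_bucket *b)
{
    b->count--;
}

/*
 * syncfs: install `fresh` as the current bucket and return the old one.
 * If the old bucket still has writes in flight, pin the new bucket with
 * one extra count so it cannot complete before its parent does.
 */
struct model_bucket *model_syncfs_begin(struct model_conn *fc,
                                        struct model_bucket *fresh,
                                        bool *pinned)
{
    struct model_bucket *old = fc->curr;

    fresh->count = 0;
    *pinned = old->count > 0;
    if (*pinned)
        fresh->count++;
    fc->curr = fresh;
    return old;
}

/* A bucket is complete once its counter has dropped to zero. */
bool model_bucket_done(const struct model_bucket *b)
{
    return b->count == 0;
}

/* After the old bucket has drained, drop the pin on the child bucket. */
void model_syncfs_finish(struct model_bucket *child, bool pinned)
{
    if (pinned)
        child->count--;
}
```

In this model a syncfs caller would wait for `model_bucket_done(old)` between `model_syncfs_begin()` and `model_syncfs_finish()`; the real code has to do that waiting and the pointer swap safely under concurrency, which is what the RCU and fc->lock remarks in the message refer to.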
03 Sep, 2021
1 commit
-
Pull overlayfs update from Miklos Szeredi:
- Copy up immutable/append/sync/noatime attributes (Amir Goldstein)
- Improve performance by enabling RCU lookup.
- Misc fixes and improvements
The reason this touches so many files is that the ->get_acl() method now
gets a "bool rcu" argument. The ->get_acl() API was updated based on
comments from Al and Linus:
Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguQxpd6Wgc0Jd3ks77zcsAv_bn0q17L3VNnnmPKu11t8A@mail.gmail.com/
* tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: enable RCU'd ->get_acl()
vfs: add rcu argument to ->get_acl() callback
ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()
ovl: use kvalloc in xattr copy-up
ovl: update ctime when changing fileattr
ovl: skip checking lower file's i_writecount on truncate
ovl: relax lookup error on mismatch origin ftype
ovl: do not set overlay.opaque for new directories
ovl: add ovl_allow_offline_changes() helper
ovl: disable decoding null uuid with redirect_dir
ovl: consistent behavior for immutable/append-only inodes
ovl: copy up sync/noatime fileattr flags
ovl: pass ovl_fs to ovl_check_setxattr()
fs: add generic helper for filling statx attribute flags
31 Aug, 2021
2 commits
-
Callers of fuse_writeback_range() assume that the file is ready for
modification by the server in the supplied byte range after the call
returns.
If there's a write that extends the file beyond the end of the supplied
range, then the file needs to be extended to at least the end of the range,
but currently that's not done.
There are at least two cases where this can cause problems:
- copy_file_range() will return short count if the file is not extended
up to end of the source range.
- FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE will not extend the file,
hence the region may not be fully allocated.
Fix by flushing writes from the start of the range up to the end of the
file. This could be optimized if the writes are non-extending, etc, but
it's probably not worth the trouble.
Fixes: a2bc92362941 ("fuse: fix copy_file_range() in the writeback case")
Fixes: 6b1bdb56b17c ("fuse: allow fallocate(FALLOC_FL_ZERO_RANGE)")
Cc: # v5.2
Signed-off-by: Miklos Szeredi
-
Pull fs hole punching vs cache filling race fixes from Jan Kara:
"Fix races leading to possible data corruption or stale data exposure
in multiple filesystems when hole punching races with operations such
as readahead.
This is the series I was sending for the last merge window but with
your objection fixed - now filemap_fault() has been modified to take
invalidate_lock only when we need to create new page in the page cache
and / or bring it uptodate"
* tag 'hole_punch_for_v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
filesystems/locking: fix Malformed table warning
cifs: Fix race between hole punch and page fault
ceph: Fix race between hole punch and page fault
fuse: Convert to using invalidate_lock
f2fs: Convert to using invalidate_lock
zonefs: Convert to using invalidate_lock
xfs: Convert double locking of MMAPLOCK to use VFS helpers
xfs: Convert to use invalidate_lock
xfs: Refactor xfs_isilocked()
ext2: Convert to using invalidate_lock
ext4: Convert to use mapping->invalidate_lock
mm: Add functions to lock invalidate_lock for two mappings
mm: Protect operations adding pages to page cache with invalidate_lock
documentation: Sync file_operations members with reality
mm: Fix comments mentioning i_mutex
19 Aug, 2021
1 commit
-
Add a rcu argument to the ->get_acl() callback to allow
get_cached_acl_rcu() to call the ->get_acl() method in the next patch.
Signed-off-by: Miklos Szeredi
18 Aug, 2021
1 commit
-
fuse_finish_open() will be called with FUSE_NOWRITE in case of atomic
O_TRUNC. This can deadlock with fuse_wait_on_page_writeback() in
fuse_launder_page() triggered by invalidate_inode_pages2().
Fix by replacing invalidate_inode_pages2() in fuse_finish_open() with a
truncate_pagecache() call. This makes sense regardless of FOPEN_KEEP_CACHE
or fc->writeback_cache, so do it unconditionally.
Reported-by: Xie Yongji
Reported-and-tested-by: syzbot+bea44a5189836d956894@syzkaller.appspotmail.com
Fixes: e4648309b85a ("fuse: truncate pending writes on O_TRUNC")
Cc:
Signed-off-by: Miklos Szeredi
12 Aug, 2021
1 commit
-
Pick up some small dax cleanups that make some of Ira's follow on work
easier.
05 Aug, 2021
2 commits
-
Make it possible to create a new mount from an already working server.
Here's a detailed description of the problem from Jakob:
"The background for this question is occasional problems we see with our
fuse filesystem [1] and mount namespaces. On a usual client, we have
system-wide, autofs managed mountpoints. When a new mount namespace is
created (which can be done unprivileged in combination with user
namespaces), it can happen that a mountpoint is used inside the new
namespace but idle in the root mount namespace. So autofs unmounts the
parent, system-wide mountpoint. But the fuse module stays active and
still serves mountpoint in the child mount namespace. Because the fuse
daemon also blocks other system wide resources corresponding to the
mountpoint, this situation effectively prevents new mounts until the
child mount namespaces closes.
[1] https://github.com/cvmfs/cvmfs"
Reported-by: Jakob Blomer
Signed-off-by: Miklos Szeredi
-
Affected call chains:
fuse_get_tree
-> get_tree_(bdev|nodev)
-> fuse_fill_super
Needed for following patch.
Signed-off-by: Miklos Szeredi
04 Aug, 2021
3 commits
-
Checking whether the "fd=", "rootmode=", "user_id=" and "group_id=" mount
options are present can be moved from fuse_get_tree() into
fuse_fill_super() where the values of the options are consumed.
This relaxes semantics of reusing a fuse blockdev mount using the device
name. Before this patch presence of these options was enforced but values
ignored, after this patch these options are completely ignored in this
case.
Signed-off-by: Miklos Szeredi
-
Naming convention under fs/fuse/:
struct fuse_conn *fc;
struct fs_context *fsc;
Signed-off-by: Miklos Szeredi
-
There is a potential race between fuse_read_interrupt() and
fuse_request_end().
TASK1
  in fuse_read_interrupt(): delete req->intr_entry (while holding
  fiq->lock)
TASK2
  in fuse_request_end(): req->intr_entry is empty -> skip fiq->lock
  wake up TASK3
TASK3
  request is freed
TASK1
  in fuse_read_interrupt(): dereference req->in.h.unique ***BAM***
Fix by always grabbing fiq->lock if the request was ever interrupted
(FR_INTERRUPTED set) thereby serializing with concurrent
fuse_read_interrupt() calls.
FR_INTERRUPTED is set before the request is queued on fiq->interrupts.
Dequeuing the request is done with list_del_init() but FR_INTERRUPTED is not
cleared in this case.
Reported-by: lijiazi
Signed-off-by: Miklos Szeredi
13 Jul, 2021
1 commit
-
Use invalidate_lock instead of fuse's private i_mmap_sem. The intended
purpose is exactly the same. By this conversion we fix a long standing
race between hole punching and read(2) / readahead(2) paths that can
lead to stale page cache contents.
CC: Miklos Szeredi
Reviewed-by: Miklos Szeredi
Signed-off-by: Jan Kara
08 Jul, 2021
1 commit
-
fuse_dax_mem_range_init() does not need the address or the pfn of the
memory requested in dax_direct_access(). It is only calling direct
access to get the number of pages.
Remove the unused variables and stop requesting the kaddr and pfn from
dax_direct_access().
Reviewed-by: Dan Williams
Signed-off-by: Ira Weiny
Reviewed-by: Vivek Goyal
Link: https://lore.kernel.org/r/20210525172428.3634316-2-ira.weiny@intel.com
Signed-off-by: Dan Williams
07 Jul, 2021
1 commit
-
Pull fuse updates from Miklos Szeredi:
- Fixes for virtiofs submounts
- Misc fixes and cleanups
* tag 'fuse-update-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
virtiofs: Fix spelling mistakes
fuse: use DIV_ROUND_UP helper macro for calculations
fuse: fix illegal access to inode with reused nodeid
fuse: allow fallocate(FALLOC_FL_ZERO_RANGE)
fuse: Make fuse_fill_super_submount() static
fuse: Switch to fc_mount() for submounts
fuse: Call vfs_get_tree() for submounts
fuse: add dedicated filesystem context ops for submounts
virtiofs: propagate sync() to file server
fuse: reject internal errno
fuse: check connected before queueing on fpq->io
fuse: ignore PG_workingset after stealing
fuse: Fix infinite loop in sget_fc()
fuse: Fix crash if superblock of submount gets killed early
fuse: Fix crash in fuse_dentry_automount() error path
04 Jul, 2021
1 commit
-
Pull iov_iter updates from Al Viro:
"iov_iter cleanups and fixes.
There are followups, but this is what had sat in -next this cycle. IMO
the macro forest in there became much thinner and easier to follow..."
* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
csum_and_copy_to_pipe_iter(): leave handling of csum_state to caller
clean up copy_mc_pipe_to_iter()
pipe_zero(): we don't need no stinkin' kmap_atomic()...
iov_iter: clean csum_and_copy_...() primitives up a bit
copy_page_from_iter(): don't need kmap_atomic() for kvec/bvec cases
copy_page_to_iter(): don't bother with kmap_atomic() for bvec/kvec cases
iterate_xarray(): only of the first iteration we might get offset != 0
pull handling of ->iov_offset into iterate_{iovec,bvec,xarray}
iov_iter: make iterator callbacks use base and len instead of iovec
iov_iter: make the amount already copied available to iterator callbacks
iov_iter: get rid of separate bvec and xarray callbacks
iov_iter: teach iterate_{bvec,xarray}() about possible short copies
iterate_bvec(): expand bvec.h macro forest, massage a bit
iov_iter: unify iterate_iovec and iterate_kvec
iov_iter: massage iterate_iovec and iterate_kvec to logics similar to iterate_bvec
iterate_and_advance(): get rid of magic in case when n is 0
csum_and_copy_to_iter(): massage into form closer to csum_and_copy_from_iter()
iov_iter: replace iov_iter_copy_from_user_atomic() with iterator-advancing variant
[xarray] iov_iter_npages(): just use DIV_ROUND_UP()
iov_iter_npages(): don't bother with iterate_all_kinds()
...
30 Jun, 2021
2 commits
-
These functions implement the address_space ->set_page_dirty operation and
should live in pagemap.h, not mm.h so that the rest of the kernel doesn't
get funny ideas about calling them directly.
Link: https://lkml.kernel.org/r/20210615162342.1669332-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Christoph Hellwig
Cc: Al Viro
Cc: Dan Williams
Cc: Greg Kroah-Hartman
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Use __set_page_dirty_no_writeback() instead. This will set the dirty bit
on the page, which will be used to avoid calling set_page_dirty() in the
future. It will have no effect on actually writing the page back, as the
pages are not on any LRU lists.
[akpm@linux-foundation.org: export __set_page_dirty_no_writeback() to modules]
Link: https://lkml.kernel.org/r/20210615162342.1669332-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Dan Williams
Cc: Greg Kroah-Hartman
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
22 Jun, 2021
4 commits
-
Fix some spelling mistakes in comments:
refernce ==> reference
happnes ==> happens
threhold ==> threshold
splitted ==> split
mached ==> matched
Signed-off-by: Zheng Yongjun
Signed-off-by: Miklos Szeredi
-
Replace open coded divisor calculations with the DIV_ROUND_UP kernel macro
for better readability.
Signed-off-by: Wu Bo
Signed-off-by: Miklos Szeredi
-
Server responds to LOOKUP and other ops (READDIRPLUS/CREATE/MKNOD/...)
with outarg containing nodeid and generation.
If a fuse inode is found in inode cache with the same nodeid but different
generation, the existing fuse inode should be unhashed and marked "bad" and
a new inode with the new generation should be hashed instead.
This can happen, for example, with passthrough fuse filesystem that returns
the real filesystem ino/generation on lookup and where real inode numbers
can get recycled due to real files being unlinked not via the fuse
passthrough filesystem.
With current code, this situation will not be detected and an old fuse
dentry that used to point to an older generation real inode, can be used to
access a completely new inode, which should be accessed only via the new
dentry.
Note that because the FORGET message carries the nodeid w/o generation, the
server should wait to get FORGET counts for the nlookup counts of the old
and reused inodes combined, before it can free the resources associated to
that nodeid.
Signed-off-by: Amir Goldstein
Signed-off-by: Miklos Szeredi
-
The current fuse module filters out fallocate(FALLOC_FL_ZERO_RANGE)
returning -EOPNOTSUPP. libnbd's nbdfuse would like to translate
FALLOC_FL_ZERO_RANGE requests into the NBD command
NBD_CMD_WRITE_ZEROES which allows NBD servers that support it to do
zeroing efficiently.
This commit treats this flag exactly like FALLOC_FL_PUNCH_HOLE.
A way to test this, requiring fuse >= 3, nbdkit >= 1.8 and the latest
nbdfuse from https://gitlab.com/nbdkit/libnbd/-/tree/master/fuse is to
create a file containing some data and "mirror" it to a fuse file:
$ dd if=/dev/urandom of=disk.img bs=1M count=1
$ nbdkit file disk.img
$ touch mirror.img
$ nbdfuse mirror.img nbd://localhost &
(mirror.img -> nbdfuse -> NBD over loopback -> nbdkit -> disk.img)
You can then run commands such as:
$ fallocate -z -o 1024 -l 1024 mirror.img
and check that the content of the original file ("disk.img") stays
synchronized. To show NBD commands, export LIBNBD_DEBUG=1 before
running nbdfuse. To clean up:
$ fusermount3 -u mirror.img
$ killall nbdkit
Signed-off-by: Richard W.M. Jones
Signed-off-by: Miklos Szeredi
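The behaviour change described in this last commit (a mode flag that used to be filtered out with -EOPNOTSUPP is now accepted alongside the hole-punch flag) can be sketched as a mode-mask check. This is a user-space model under stated assumptions: the `MODEL_FL_*` constants and both functions are hypothetical, and the exact form of the kernel check is not given in the message.

```c
#include <errno.h>

/* Hypothetical flag values for this sketch, not the kernel's headers. */
#define MODEL_FL_KEEP_SIZE  0x01
#define MODEL_FL_PUNCH_HOLE 0x02
#define MODEL_FL_ZERO_RANGE 0x10

/* Before: any mode bit outside KEEP_SIZE | PUNCH_HOLE is rejected. */
int model_check_mode_before(int mode)
{
    if (mode & ~(MODEL_FL_KEEP_SIZE | MODEL_FL_PUNCH_HOLE))
        return -EOPNOTSUPP;
    return 0;
}

/* After: ZERO_RANGE is accepted in the same way as PUNCH_HOLE. */
int model_check_mode_after(int mode)
{
    if (mode & ~(MODEL_FL_KEEP_SIZE | MODEL_FL_PUNCH_HOLE |
                 MODEL_FL_ZERO_RANGE))
        return -EOPNOTSUPP;
    return 0;
}
```

Once the request passes a check like this it still has to be forwarded to the userspace server, which may itself support or reject the operation.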