26 Apr, 2006
3 commits
-
BKL does not protect against races if the task may sleep between
checking and setting a value. So move checking of file->private_data
near to setting it in fuse_fill_super().

Found by Al Viro.
Signed-off-by: Miklos Szeredi
-
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock. Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The solution is to move the fput() outside the region protected by
sbput_sem.
Signed-off-by: Miklos Szeredi
-
This reverts commit 73ce8355c243a434524a34c05cc417dd0467996e.
It was wrong because it didn't take into account the requirement
that iput() for background requests must be performed synchronously
with ->put_super(); otherwise active inodes may remain after unmount.

The right solution is to keep the sbput_sem and perform iput() within
the locked region, but move fput() outside sbput_sem.
Signed-off-by: Miklos Szeredi
12 Apr, 2006
4 commits
-
It's cleaner to allocate a new request, otherwise the uid/gid/pid
fields of the request won't be filled in.
Signed-off-by: Miklos Szeredi
-
Request is already initialized in fuse_request_alloc() so no need to
do it again in fuse_get_req().
Signed-off-by: Miklos Szeredi
-
Properly accounting the number of waiting requests was forgotten in
the "clean up request accounting" patch.
Signed-off-by: Miklos Szeredi
-
A deadlock was possible, when the last reference to the superblock was
held due to a background request containing a file reference.

Releasing the file would release the vfsmount which in turn would
release the superblock. Since sbput_sem is held during the fput() and
fuse_put_super() tries to acquire this same semaphore, a deadlock
results.

The chosen solution is to get rid of sbput_sem, and instead use the
spinlock to ensure the referenced inodes/file are released only once.
Since the actual release may sleep, defer these outside the locked
region, but using local variables instead of the structure members.

This is a much more robust solution.
Signed-off-by: Miklos Szeredi
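As a rough userspace illustration of the "local variables instead of the structure members" idea described above: the pointers are detached while the lock is held, and the release, which may sleep, happens only after the lock is dropped. This is a minimal sketch, not the code from the patch; struct resource, struct request, toy_lock(), toy_unlock(), release_resource() and release_background_refs() are all hypothetical names, and the "lock" is a plain flag standing in for the real spinlock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the FUSE source. */
struct resource {
    int released;               /* how many times this object was released */
};

struct request {
    int locked;                 /* toy flag standing in for a spinlock */
    struct resource *inode;
    struct resource *file;
};

static void toy_lock(struct request *req)
{
    assert(!req->locked);
    req->locked = 1;
}

static void toy_unlock(struct request *req)
{
    assert(req->locked);
    req->locked = 0;
}

/* Stands in for a release that may sleep, so it must not run under the lock. */
static void release_resource(struct request *req, struct resource *res)
{
    assert(!req->locked);
    if (res)
        res->released++;
}

static void release_background_refs(struct request *req)
{
    struct resource *inode, *file;

    /* Detach the references under the lock so only one caller gets them. */
    toy_lock(req);
    inode = req->inode;
    file = req->file;
    req->inode = NULL;
    req->file = NULL;
    toy_unlock(req);

    /* Do the (possibly sleeping) release outside the locked region. */
    release_resource(req, inode);
    release_resource(req, file);
}
```

Calling release_background_refs() twice on the same request releases each object only once, because the second call finds the structure members already cleared.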
11 Apr, 2006
9 commits
-
The previous patch removed limiting the number of outstanding requests. This
patch adds a much simpler limit that is also compatible with file locking
operations.

A task may have at most one synchronous request allocated. So these requests
need not be otherwise limited.

However the number of background requests (release, forget, asynchronous
reads, interrupted requests) can grow indefinitely. This can be used by a
malicious user to cause FUSE to allocate arbitrary amounts of unswappable
kernel memory, denying service.

For this reason add a limit for the number of background requests, and block
allocations of new requests until the number goes below the limit.

Also use this mechanism to block all requests until the INIT reply is
received.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
FUSE allocated most requests from a fixed size pool filled at mount time.
However in some cases (release/forget) non-pool requests were used. File
locking operations aren't well served by the request pool, since they may
block indefinitely, thus exhausting the pool.

This patch removes the request pool and always allocates requests on demand.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Return consistent error values for the case when the opened device file has no
mount associated yet.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Remove the global spinlock in favor of a per-mount one.
This patch is basically find & replace. The difficult part has already been
done by the previous patch.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This is in preparation for removing the global spinlock in favor of a
per-mount one.

The only critical part is the interaction between fuse_dev_release() and
fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.

This is ensured by the fput() operation, which will synchronize the assignment
with other CPUs that may do a final fput() soon after this.

Also redundant locking is removed from fuse_fill_super(), where exclusion is
already ensured by the BKL held for this function by the VFS.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
I don't like duplicating the connected and list_empty tests in fuse_dev_readv,
but this seemed cleaner than adding the f_flags test to request_wait.
Signed-off-by: Jeff Dike
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This adds asynchronous notification to FUSE - a FUSE server can request
O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
available.

One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
does no locking, unlike the other methods. I think it's unnecessary, as the
fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
which provide their own locking. It would also be wrong to use the fuse_lock,
as it's a spin lock and fasync_helper can sleep. My one concern with this is
the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
reference on the file struct, so this seems not to be a problem.
Signed-off-by: Jeff Dike
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
fuse_dev_poll() returned an error value instead of a poll mask. Luckily (or
unluckily) -ENODEV does contain the POLLERR bit.

There's also a race if the filesystem is unmounted between fuse_get_conn() and
spin_lock(), in which case this event will be missed by poll().
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
During heavy parallel filesystem activity it was possible to Oops the kernel.
The reason is that read_cache_pages() could skip pages which have already been
inserted into the cache by another task. Occasionally this may result in zero
pages actually being sent, while fuse_send_readpages() relies on at least one
page being in the request.

So check this corner case and just free the request instead of trying to send
it.

Reported and tested by Konstantin Isakov.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
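The corner case above can be pictured with a small userspace sketch: if every candidate page was already cached by someone else, the request ends up with zero pages and is freed rather than sent. This is only an illustration under stated assumptions; struct read_request, fill_request() and submit_readpages() are hypothetical names, and the cached[] array stands in for the page cache state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request type; not the kernel's struct fuse_req. */
struct read_request {
    unsigned num_pages;
};

enum outcome { REQ_SENT, REQ_FREED };

/*
 * Add every page that is not yet cached (cached[i] == 0) to the request.
 * Pages already inserted by another task are skipped, so the result
 * can legitimately be zero.
 */
static unsigned fill_request(struct read_request *req,
                             const int *cached, unsigned n)
{
    unsigned i;

    req->num_pages = 0;
    for (i = 0; i < n; i++)
        if (!cached[i])
            req->num_pages++;
    return req->num_pages;
}

static enum outcome submit_readpages(const int *cached, unsigned n)
{
    struct read_request *req = malloc(sizeof(*req));

    if (!req)
        abort();

    /* Corner case: nothing to read, so free the request instead of sending. */
    if (fill_request(req, cached, n) == 0) {
        free(req);
        return REQ_FREED;
    }

    /* A real implementation would send the request here. */
    free(req);
    return REQ_SENT;
}
```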
29 Mar, 2006
1 commit
-
This is a conversion to make the various file_operations structs in fs/
const. Basically a regexp job, with a few manual fixups.

The goal is both to increase correctness (harder to accidentally write to
shared data structures) and to reduce the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read-only and thus
cache clean).
Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
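For readers unfamiliar with the pattern, a const operations table looks roughly like the following in plain C. This is a sketch only; struct toy_file_operations, toy_open() and toy_fops are hypothetical and much smaller than the kernel's struct file_operations.

```c
#include <assert.h>

/* Hypothetical, cut-down operations table. */
struct toy_file_operations {
    int (*open)(void);
};

static int toy_open(void)
{
    return 0;
}

/*
 * Declared const: the compiler rejects accidental writes such as
 * "toy_fops.open = NULL;", and the object can be placed in read-only data.
 */
static const struct toy_file_operations toy_fops = {
    .open = toy_open,
};
```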
01 Mar, 2006
1 commit
-
If negative entries (nodeid == 0) were sent in reply to LOOKUP requests,
two bugs could be triggered:

- looking up a negative entry would return -EIO,
- revalidate on an entry which turned negative would send a FORGET
  request with zero nodeid, which would cause an abort() in the
  library.

The above would only happen if the 'negative_timeout=N' option was used,
otherwise lookups reply -ENOENT, which worked correctly.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
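The intended behaviour can be sketched as two small checks: a successful LOOKUP reply with nodeid == 0 is treated as a negative entry just like -ENOENT, and no FORGET is sent for a zero nodeid. This is a userspace sketch under assumptions, not the patch itself; classify_lookup() and should_send_forget() are hypothetical helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum lookup_result { LOOKUP_POSITIVE, LOOKUP_NEGATIVE };

/*
 * reply_error is 0 or a negative errno from the server; nodeid is the
 * node id carried in a successful reply. Returns 0 and sets *result,
 * or returns the negative errno for any other error.
 */
static int classify_lookup(int reply_error, uint64_t nodeid,
                           enum lookup_result *result)
{
    /* Both -ENOENT and a zero nodeid mean "this name does not exist". */
    if (reply_error == -ENOENT || (reply_error == 0 && nodeid == 0)) {
        *result = LOOKUP_NEGATIVE;
        return 0;
    }
    if (reply_error)
        return reply_error;

    *result = LOOKUP_POSITIVE;
    return 0;
}

/* A negative entry holds no node on the server side, so nothing to forget. */
static int should_send_forget(uint64_t nodeid)
{
    return nodeid != 0;
}
```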
18 Feb, 2006
1 commit
-
There's a rather theoretical case of the BUG triggering in
fuse_reset_request():

- iget() fails because of OOM after a successful CREATE_OPEN request
- during IO on the resulting RELEASE request the connection is aborted

Fix and add warning to fuse_reset_request().
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Feb, 2006
1 commit
-
The last fix for this function in fact opened up a race that triggers
much more often.

It was uncommented, tricky code that was buggy. Add a comment, make it less
tricky and fix the bug.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Feb, 2006
1 commit
-
While asynchronous reads mean a performance improvement in most cases, if
the filesystem assumed that reads are synchronous, then async reads may
degrade performance (filesystem may receive reads out of order, which can
confuse its own readahead logic).

With sshfs a 1.5 to 4 times slowdown can be measured.
There's also a need for userspace filesystems to know whether asynchronous
reads are supported by the kernel or not.

To achieve these, negotiate in the INIT request whether async reads will be
used and the maximum readahead value. Update interface version to 7.6.

If userspace uses a version earlier than 7.6, then disable async reads, and
set maximum readahead value to the maximum read size, as done in previous
versions.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
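The version rule described above can be sketched in a few lines of userspace C. This is an illustration under assumptions rather than the kernel code: struct toy_init_reply, struct toy_conn, TOY_ASYNC_READ and apply_init_reply() are hypothetical names, and the real INIT reply layout and flag values are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ASYNC_READ (1u << 0)    /* hypothetical flag bit */

/* Hypothetical view of the fields userspace sends back in its INIT reply. */
struct toy_init_reply {
    uint32_t major;
    uint32_t minor;
    uint32_t max_readahead;
    uint32_t flags;
};

/* Hypothetical per-connection settings on the kernel side. */
struct toy_conn {
    uint32_t max_read;
    uint32_t max_readahead;
    int async_read;
};

static void apply_init_reply(struct toy_conn *fc,
                             const struct toy_init_reply *reply)
{
    int new_enough = reply->major > 7 ||
                     (reply->major == 7 && reply->minor >= 6);

    if (new_enough) {
        /* 7.6 and later: honour what userspace negotiated. */
        fc->async_read = (reply->flags & TOY_ASYNC_READ) != 0;
        fc->max_readahead = reply->max_readahead;
    } else {
        /* Older userspace: no async reads, readahead capped at max read. */
        fc->async_read = 0;
        fc->max_readahead = fc->max_read;
    }
}
```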
17 Jan, 2006
16 commits
-
Fix race in setting bitfields of fuse_conn. Spotted by Andrew Morton.
The two fields ->connected and ->mounted were always changed with the
fuse_lock held. But other bitfields in the same structure were changed
without the lock. In theory this could lead to losing the assignment of
even the ones under lock. The chosen solution is to change these two
fields to be a full unsigned type. The other bitfields aren't "important"
enough to warrant the extra complexity of full locking or changing them to
bitops.

For all bitfields document why they are safe wrt. concurrent
assignments.

Also make the initialization of the 'num_waiting' atomic counter explicit.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch changes fuse_readpages() to send READ requests asynchronously.
This makes it possible for userspace filesystems to utilize the kernel
readahead logic instead of having to implement their own (resulting in double
caching).
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add a separate function for filling in the READ request. This will make it
possible to send asynchronous READ requests as well as synchronous ones.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Now the INIT requests can be completely handled in inode.c and the
fuse_send_init() function need not be global any more.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add possibility for requests to run asynchronously and call an 'end' callback
when finished.

With this, the special handling of the INIT and RELEASE requests can be
cleaned up too.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add ability to abort a filesystem connection.
With the introduction of asynchronous reads, the ability to interrupt any
request is not enough to dissolve deadlocks, since now waiting for the request
completion (page unlocked) is independent of the actual request, so in a
deadlock all threads will be uninterruptible.

The solution is to make it possible to abort all requests, even those
currently undergoing I/O to/from userspace. The natural interface for this is
'umount -f mountpoint', but that only works as long as the filesystem is
attached. So also add an 'abort' attribute to the sysfs view of the
connection.
Signed-off-by: Miklos Szeredi
Cc: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This patch adds the 'waiting' attribute which indicates how many filesystem
requests are currently waiting to be completed. A non-zero value without any
filesystem activity indicates a hung or deadlocked filesystem.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Kobjectify fuse_conn, and make it visible under /sys/fs/fuse/connections.
Lacking any natural naming, connections are numbered.
This patch doesn't add any attributes, just the infrastructure.
Signed-off-by: Miklos Szeredi
Cc: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The ->connected flag for a fuse_conn object previously only indicated whether
the device file for this connection is currently open or not.

Change its meaning so that it indicates whether the connection is active or
not: now either umount or device release will clear the flag.

The separate ->mounted flag is still needed for handling background requests.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Create a new list for requests in the process of being transferred to/from
userspace. This will be needed to be able to abort all requests, even those
currently under I/O.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The state of request was made up of 2 bitfields (->sent and ->finished) and of
the fact that the request was on a list or not.

Unify this into a single state field.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
- remove some unneeded assignments
- use kzalloc instead of kmalloc + memset
- simplify setting sb->s_fs_info
- in fuse_send_init() use fuse_get_request() instead of
  do_get_request() helper
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Inline keyword is unnecessary in most cases. Clean them up.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Handle the case when the INIT request is answered with an error.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This function used the request object after decrementing its reference count
and releasing the lock. This could in theory lead to all sorts of problems.

Fix and simplify at the same time.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
fuse_copy_finish() must be called before request_end(), since the latter might
sleep, and no sleeping is allowed between fuse_copy_one() and
fuse_copy_finish() because of kmap_atomic()/kunmap_atomic() used in them.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
10 Jan, 2006
1 commit
-
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar
(finished the conversion)
Signed-off-by: Jes Sorensen
Signed-off-by: Ingo Molnar
07 Jan, 2006
2 commits
-
Previously invalid types were quietly changed to regular files, but at
revalidation the inode was changed to bad. This was rather inconsistent
behavior.

Now check if the type is valid on initial lookup, and return -EIO if not.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
In direct_io mode, send at least one page per request. Previously it was
possible that requests with zero data were sent, and hence the read/write
didn't make any progress, resulting in an infinite (though interruptible)
loop.
Signed-off-by: Miklos Szeredi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
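Why a "zero pages" request leads to a loop that never finishes, and how clamping to at least one page avoids it, can be shown with a toy transfer loop. This is a sketch under assumptions and does not reproduce the actual patch; TOY_PAGE_SIZE, TOY_MAX_PAGES, pages_per_request() and transfer() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u     /* hypothetical page size */
#define TOY_MAX_PAGES 32u       /* hypothetical per-request page limit */

/*
 * Number of pages to put in one request for a given byte limit.
 * Rounding down alone could yield 0; clamp to at least one page.
 */
static unsigned pages_per_request(size_t limit_bytes)
{
    size_t npages = limit_bytes / TOY_PAGE_SIZE;

    if (npages < 1)
        npages = 1;
    if (npages > TOY_MAX_PAGES)
        npages = TOY_MAX_PAGES;
    return (unsigned)npages;
}

/* Count how many requests are needed to move `count` bytes. */
static long transfer(size_t count, size_t limit_bytes)
{
    size_t chunk = (size_t)pages_per_request(limit_bytes) * TOY_PAGE_SIZE;
    long requests = 0;

    while (count > 0) {
        /* chunk is never 0, so every iteration makes progress. */
        size_t n = count < chunk ? count : chunk;

        count -= n;
        requests++;
    }
    return requests;
}
```

Without the lower clamp, a byte limit smaller than one page would give a chunk of zero and the loop would spin without consuming any data.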